Домашнее задание №2¶

Выполнил Черкасов Борис Юрьевич, студент 416 группы ВМК МГУ имени М. В. Ломоносова

Векторные модели временных рядов: VAR, VARMAX, VECM. Нужно выбрать акции (можно как на российском, так и на американском рынке), по 3 акции из 3 секторов экономики и построить прогнозы для них на 10/20/100 торговых сессий вперед. Не забывайте про выводы при использовании каждой модели и предварительных вычислений для каждой модели.¶

Дедлайн 23 ноября 23:59

Возьмем данные с Yahoo Finance, есть данные по акциям США, России.¶

В качестве секторов экономики возьмем следующие:

  • Россия:

    • Финансы (Банки): SBER.ME, VTBR.ME, TCSG.ME;
    • Нефтегаз: GAZP.ME, LKOH.ME, ROSN.ME;
    • Металлургия: NLMK.ME, GMKN.ME, CHMF.ME.
  • США:

    • Tech: APPL, MSFT, NVDA;
    • Retail: WMT, COST, TGT;
    • Energy: XOM, CVX, COP.

P.S. Как оказалось, для банков в России с yfinance для объединения записей, имеются данные с 2019 года до 2022, поэтому по сравнению с другими наборами данных, этот сектор будет иметь меньше записей, предположу, что 627 записей не будет критично для исследования ;)

In [26]:
# Требуемые библиотеки
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import time
import warnings
warnings.filterwarnings('ignore')

# статистика
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.statespace.varmax import VARMAX
from statsmodels.tsa.vector_ar.vecm import coint_johansen, VECM
from statsmodels.tools.eval_measures import rmse, mse
import yfinance as yf

Загрузка данных¶

Загрузим данные об акциях, используя тикеры, которые выше. Возьмем период с текущей даты на 10 лет в прошлое....

In [27]:
# Пример скачивания данных для Тинькоффа
import yfinance as yf
df = yf.download("TCSG.ME", period="5y", threads=False, progress=False)
print(df.tail())
Price        Close    High     Low    Open  Volume
Ticker     TCSG.ME TCSG.ME TCSG.ME TCSG.ME TCSG.ME
Date                                              
2022-05-18  2402.0  2549.5  2360.0  2432.0  475508
2022-05-19  2236.5  2400.0  2205.0  2400.0  325695
2022-05-20  2236.5  2236.5  2236.5  2236.5       0
2022-05-23  2236.5  2236.5  2236.5  2236.5       0
2022-05-24  2236.5  2236.5  2236.5  2236.5       0
In [28]:
# save_each_ticker.py
import os

# Параметры
OUTPUT_DIR = "market_data"           # папка для csv
os.makedirs(OUTPUT_DIR, exist_ok=True)
PERIOD = "10y"                        # можно "max" или "10y"
RETRIES = 6
PAUSE = 1                            # базовая пауза для экспоненциального backoff
TICKERS = [
    # Россия: финансы, нефть/газ, металлургия
    "SBER.ME","VTBR.ME","TCSG.ME",
    "GAZP.ME","LKOH.ME","ROSN.ME",
    "NLMK.ME","GMKN.ME","CHMF.ME",
    # США: tech, retail, energy
    "AAPL","MSFT","NVDA",
    "WMT","COST","TGT",
    "XOM","CVX","COP"
]

def save_df_to_csv(df, ticker):
    """Сохраняет DataFrame в CSV с датой записи; возвращает путь."""
    fname = f"{ticker.replace('/','_')}.csv"
    path = os.path.join(OUTPUT_DIR, fname)
    df.to_csv(path)
    return path

def download_single(ticker, period=PERIOD, retries=RETRIES, pause=PAUSE):
    """
    Скачивает один тикер с yfinance, с retries, threads=False и экспоненциальным backoff.
    Возвращает DataFrame или None.
    """
    for attempt in range(1, retries+1):
        try:
            df = yf.download(ticker, period=period, threads=False, progress=False)
            if df is None or df.empty:
                print(f"⚠ {ticker}: пустой DataFrame (попытка {attempt}/{retries})")
            else:
                # Заменим локальную дату на индекс DatetimeIndex, отфильтруем только Adjusted Close
                if 'Adj Close' in df.columns:
                    # сохраняем весь df (OHLCV и Adj Close) — удобнее для анализа
                    print(f"✅ {ticker}: успешно загружен (attempt {attempt}) — {len(df)} строк")
                else:
                    print(f"ℹ {ticker}: загружен, но колонки 'Adj Close' отсутствует (attempt {attempt})")
                return df
        except Exception as e:
            print(f"❌ {ticker}: ошибка при загрузке (attempt {attempt}/{retries}) — {e}")

        # backoff перед следующей попыткой
        sleep_time = pause * (2 ** (attempt-1))
        print(f"   ...ждём {sleep_time}s и повторяем")
        time.sleep(sleep_time)

    # если дошли сюда — не удалось
    print(f"⛔ {ticker}: НЕ удалось загрузить после {retries} попыток")
    return None

def try_moex_fallback(ticker):
    """
    Простейшая подсказка: для российских тикеров можно использовать MOEX ISS.
    Здесь не реализован полный парсер — вместо этого сообщим, как вручную попробовать.
    (Если нужно, могу дополнить скрипт полной загрузкой через MOEX API.)
    """
    print(f"→ fallback: если {ticker} российский и не грузится, используйте MOEX ISS API или библиотеку 'moex'.")

# main loop: скачиваем по одному и сохраняем
failed = []
for t in TICKERS:
    print(f"\n=== Скачиваем {t} ===")
    out_path = os.path.join(OUTPUT_DIR, f"{t.replace('/','_')}.csv")
    # если уже есть файл — пропускаем (комментируй, если нужно перезагрузить)
    if os.path.exists(out_path):
        print(f"Файл {out_path} уже есть — пропускаю (удалите файл, чтобы перекачать).")
        continue

    df = download_single(t)
    if df is not None and not df.empty:
        saved = save_df_to_csv(df, t)
        print(f"Сохранено в {saved}")
    else:
        failed.append(t)
        # попробуем быстрый fallback (пока только уведомление)
        if t.endswith(".ME"):
            try_moex_fallback(t)

# итог
print("\n----- Завершено -----")
print("Успешно: ", [t for t in TICKERS if os.path.exists(os.path.join(OUTPUT_DIR, f'{t.replace('/','_')}.csv'))])
if failed:
    print("Не удалось загрузить:", failed)
else:
    print("Все тикеры успешно загружены.")
=== Скачиваем SBER.ME ===
Файл market_data\SBER.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем VTBR.ME ===
Файл market_data\VTBR.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем TCSG.ME ===
Файл market_data\TCSG.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем GAZP.ME ===
Файл market_data\GAZP.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем LKOH.ME ===
Файл market_data\LKOH.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем ROSN.ME ===
Файл market_data\ROSN.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем NLMK.ME ===
Файл market_data\NLMK.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем GMKN.ME ===
Файл market_data\GMKN.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем CHMF.ME ===
Файл market_data\CHMF.ME.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем AAPL ===
Файл market_data\AAPL.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем MSFT ===
Файл market_data\MSFT.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем NVDA ===
Файл market_data\NVDA.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем WMT ===
Файл market_data\WMT.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем COST ===
Файл market_data\COST.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем TGT ===
Файл market_data\TGT.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем XOM ===
Файл market_data\XOM.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем CVX ===
Файл market_data\CVX.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

=== Скачиваем COP ===
Файл market_data\COP.csv уже есть — пропускаю (удалите файл, чтобы перекачать).

----- Завершено -----
Успешно:  ['SBER.ME', 'VTBR.ME', 'TCSG.ME', 'GAZP.ME', 'LKOH.ME', 'ROSN.ME', 'NLMK.ME', 'GMKN.ME', 'CHMF.ME', 'AAPL', 'MSFT', 'NVDA', 'WMT', 'COST', 'TGT', 'XOM', 'CVX', 'COP']
Все тикеры успешно загружены.

Теперь, для каждого сектора сформируем отдельный датасет для упрощения анализа и далее будем работать с датафреймами.

Для удобства реализуем функцию очистки, чтобы привести все данные в единый датафрейм.

In [29]:
# Функция для очистки и формирования итоговых датафреймов для работы
def load_and_clean_csv(path, ticker_name=None):
    """
    Загружает CSV, удаляет мусорные строки, повторные хедеры,
    не-даты, не-числовые строки и возвращает Series цен.
    """

    df = pd.read_csv(path)

    # 1. Удаляем строки, где все значения строковые (например повторный header)
    df = df[~df.apply(lambda r: r.astype(str).str.contains("Open|Close|High|Low|Adj|Volume").any(), axis=1)]

    # 2. Пробуем определить колонку даты
    # типично она первая
    if "Date" in df.columns:
        df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    else:
        # fallback: первая колонка
        df.iloc[:,0] = pd.to_datetime(df.iloc[:,0], errors="coerce")
        df = df.rename(columns={df.columns[0]: "Date"})

    # 3. Убираем строки, где дата не распарсилась
    df = df[df["Date"].notna()]
    df = df.set_index("Date")
    df = df.sort_index()

    # 4. Ищем колонку цены
    price_col = None
    preferred = ["Adj Close", "Close", "close", "adjclose"]

    for col in preferred:
        if col in df.columns:
            price_col = col
            break

    if price_col is None:
        # fallback: первая числовая колонка
        numeric_cols = df.select_dtypes(include=[np.number]).columns
        if len(numeric_cols)==0:
            raise ValueError(f"{ticker_name}: не найдено числовых колонок после очистки")
        price_col = numeric_cols[0]

    # 5. Приводим цену к float
    s = pd.to_numeric(df[price_col], errors="coerce")

    # 6. Убираем NaN строки
    s = s.dropna()

    s.name = ticker_name if ticker_name else price_col
    return s
In [30]:
import os

# Директория для взятия загруженных данных с yfinance
DATA_DIR = "market_data"

# Загружаем все тикеры из CSV → dict[ticker] = Series
# Используем ранее написанную функцию для очистки и формирования датафреймов
dfs = {}
for f in os.listdir(DATA_DIR):
    if f.endswith(".csv"):
        ticker = f.replace(".csv", "")
        path = os.path.join(DATA_DIR, f)

        try:
            s = load_and_clean_csv(path, ticker_name=ticker)
            dfs[ticker] = s
        except Exception as e:
            print(f"❌ Ошибка при загрузке {ticker}: {e}")

# Группы
rus_fin = ["SBER.ME","VTBR.ME","TCSG.ME"]
rus_oil = ["GAZP.ME","LKOH.ME","ROSN.ME"]
rus_met = ["NLMK.ME","GMKN.ME","CHMF.ME"]

us_tech = ["AAPL","MSFT","NVDA"]
us_retail = ["WMT","COST","TGT"]
us_energy = ["XOM","CVX","COP"]

groups = {
    "rus_fin": rus_fin,
    "rus_oil": rus_oil,
    "rus_met": rus_met,
    "us_tech": us_tech,
    "us_retail": us_retail,
    "us_energy": us_energy
}

# Функция для формирования датафрейма сектора
def build_sector_df(tickers):
    df = pd.concat([dfs[t] for t in tickers], axis=1, join="inner")
    df.columns = tickers
    df = df.dropna()
    return df

sector_data = {name: build_sector_df(ticks) for name, ticks in groups.items()}

# Сохраняем
for name, df in sector_data.items():
    df.to_csv(f"{name}_prices.csv")
    print(name, df.shape, df.index.min(), df.index.max())
rus_fin (626, 3) 2019-10-29 00:00:00 2022-05-24 00:00:00
rus_oil (1616, 3) 2015-11-20 00:00:00 2022-05-24 00:00:00
rus_met (1616, 3) 2015-11-20 00:00:00 2022-05-24 00:00:00
us_tech (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00
us_retail (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00
us_energy (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00

Отлично, теперь можно приступить к EDA.

EDA + тесты на стационарность¶

Для упрощения, реализуем функции, которые будут отвечать за тесты на стационарность (Дики-Фуллера и KPSS), корреляционные матрицы, heatmapы и т.д. и т.п.

Также сделаем лог-преобразования для устойчивости векторных моделей в дальнейшем...

In [31]:
import seaborn as sns
import scipy.stats as st
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}

def load_sector(path):
    df = pd.read_csv(path, index_col=0, parse_dates=True)
    df = df.sort_index()
    return df

def adf_test(x):
    res = adfuller(x, autolag="AIC")
    return res[1]  # p-value

def kpss_test(x):
    try:
        stat, p, l, crit = kpss(x, regression="c")
        return p
    except:
        return np.nan
    
# Посмотрим на график квантилей для оценки нормальности распределений, поскольку VAR предполагает нормальность остатков
def qq_plots(df_returns, sector_name):
    plt.figure(figsize=(14,4))
    for i, col in enumerate(df_returns.columns):
        plt.subplot(1, len(df_returns.columns), i+1)
        st.probplot(df_returns[col], dist="norm", plot=plt)
        plt.title(col)
    plt.suptitle(f"QQ-plots ({sector_name})")
    plt.show()

# Графики автокорреляционных функций позволят сделать выводы о подходящей модели и ее параметрах 
def acf_pacf(df_returns, sector_name):
    for col in df_returns.columns:
        fig, ax = plt.subplots(1,2, figsize=(12,4))
        plot_acf(df_returns[col], ax=ax[0])
        plot_pacf(df_returns[col], ax=ax[1])
        fig.suptitle(f"{sector_name} — {col}")
        plt.show()

# Посмотрим на скользящее отклонение (дисперсию), возьмем для месяца, чтобы понять как сильно различаются цены по месяцам для акций
def rolling_vol(df_returns, sector_name, window=30):
    vol = df_returns.rolling(window).std() * np.sqrt(252)
    vol.plot(figsize=(12,5))
    plt.title(f"{sector_name} — 30-day annualized volatility")
    plt.grid(True)
    plt.show()

# Проверим перекрестные корреляции, то есть как взаимосвязаны лаги между собой, чтобы уловить полезные зависимости, которые могут быть полезны для VAR
def lagged_cross_corr(df_returns, sector_name, lag=1):
    df_lag = df_returns.shift(lag)
    corr = df_returns.corrwith(df_lag)
    print(f"\nLagged correlation (lag={lag}) for {sector_name}:")
    display(corr)


# Стало скучно, нарисуем кластер зависимостей)))
def cluster_market_regimes(df_returns, sector_name, k=3):
    scaler = StandardScaler()
    X = scaler.fit_transform(df_returns)

    km = KMeans(n_clusters=k, random_state=42)
    labels = km.fit_predict(X)

    plt.figure(figsize=(14,4))
    plt.plot(df_returns.index, labels, label="Regime")
    plt.title(f"{sector_name} — Market Regimes (k-means)")
    plt.grid(True)
    plt.show()


# ======================================================
#   EDA Function (plots + stats + tests + conclusions)
# ======================================================

def run_eda(name, df_prices):
    print("="*80)
    print(f"📌 EDA СЕКТОР: {name.upper()}")
    print("="*80)

    print("\nРазмер данных:", df_prices.shape)
    print("Диапазон дат:", df_prices.index.min(), "→", df_prices.index.max())
    display(df_prices.head())

    # --------------------------- Prices plot
    plt.figure(figsize=(14,5))
    plt.plot(df_prices)
    plt.title(f"Цены ({name})")
    plt.grid(True)
    plt.legend(df_prices.columns)
    plt.show()

    # --------------------------- Log Returns
    df_returns = np.log(df_prices).diff().dropna()

    plt.figure(figsize=(14,5))
    plt.plot(df_returns)
    plt.title(f"Лог-доходности ({name})")
    plt.grid(True)
    plt.legend(df_returns.columns)
    plt.show()

    # --------------------------- Distribution
    df_returns.hist(bins=50, figsize=(12,6))
    plt.suptitle(f"Гистограммы доходностей — {name}")
    plt.show()

    # --------------------------- Boxplot
    plt.figure(figsize=(10,5))
    sns.boxplot(data=df_returns)
    plt.title(f"Boxplot доходностей ({name})")
    plt.show()

    # --------------------------- Correlation
    plt.figure(figsize=(6,5))
    sns.heatmap(df_returns.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
    plt.title(f"Корреляционная матрица доходностей ({name})")
    plt.show()

    # --------------------------- ADF & KPSS
    print("\n📊 Тесты стационарности (УРОВНИ):")
    adf_levels = {col: adf_test(df_prices[col]) for col in df_prices.columns}
    kpss_levels = {col: kpss_test(df_prices[col]) for col in df_prices.columns}
    display(pd.DataFrame({"ADF p-value": adf_levels, "KPSS p-value": kpss_levels}))

    # --------------------------- ACF @ PACF
    acf_pacf(df_prices, name)

    # --------------------------- Q-Q plot
    qq_plots(df_prices, name)

    # --------------------------- Rolling volatility
    rolling_vol(df_prices, name)

    # --------------------------- Lagged cross correlation
    lagged_cross_corr(df_prices, name)

    # --------------------------- Clustering analysis
    cluster_market_regimes(df_prices, name)

    print("\n📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):")
    adf_ret = {col: adf_test(df_returns[col]) for col in df_returns.columns}
    kpss_ret = {col: kpss_test(df_returns[col]) for col in df_returns.columns}
    display(pd.DataFrame({"ADF p-value": adf_ret, "KPSS p-value": kpss_ret}))



    # --------------------------- Conclusions
    print("\n📌 ВЫВОДЫ ДЛЯ СЕКТОРА", name.upper(), "\n")

    # Уровни
    print("• Уровни цен: как правило НЕ стационарны.")
    print("  - ADF p-value > 0.05 → не отвергаем единичный корень.")
    print("  - KPSS p-value < 0.05 → ряд НЕ стационарен.")
    print("  Это НОРМА для финансовых цен → VECM на уровнях возможен.")

    # Доходности
    print("• Лог-доходности: обычно стационарны.")
    print("  - ADF p < 0.05 → стационарны.")
    print("  - KPSS p > 0.05 → стационарность подтверждается.")
    print("  Это делает доходности корректным выбором для VAR/VARMAX.")

    # Корреляции
    corr = df_returns.corr()
    avg_corr = corr.values[np.triu_indices_from(corr.values, k=1)].mean()
    print(f"• Средняя корреляция внутри сектора: {avg_corr:.3f}")
    if avg_corr > 0.6:
        print("  ➝ Сектор движется синхронно, VAR/VECM должны обнаружить сильные связи.")
    elif avg_corr < 0.2:
        print("  ➝ Активы слабо связаны, VAR может быть слабым.")
    else:
        print("  ➝ Умеренная связь, модели должны выявить перекрестные эффекты.")

    print("\nГотово.\n\n")

    return df_returns
In [32]:
# =====================
all_returns = {}

for sector, file in SECTORS.items():
    df = load_sector(file)
    df_ret = run_eda(sector, df)
    all_returns[sector] = df_ret

print("\n\n🎉 FULL EDA COMPLETED FOR ALL SECTORS")
================================================================================
📌 EDA СЕКТОР: RUS_FIN
================================================================================

Размер данных: (626, 3)
Диапазон дат: 2019-10-29 00:00:00 → 2022-05-24 00:00:00
SBER.ME VTBR.ME TCSG.ME
Date
2019-10-29 207.504654 691.412476 1219.064453
2019-10-30 206.986130 698.477478 1216.265869
2019-10-31 202.993500 691.894165 1215.266479
2019-11-01 204.298462 692.857605 1213.267700
2019-11-05 206.139221 706.987732 1197.277466
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
SBER.ME 0.873321 0.01
VTBR.ME 0.598282 0.01
TCSG.ME 0.702201 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for rus_fin:
SBER.ME    0.994662
VTBR.ME    0.997195
TCSG.ME    0.997232
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
SBER.ME 6.279155e-15 0.100000
VTBR.ME 4.694274e-14 0.100000
TCSG.ME 1.131758e-29 0.053674
📌 ВЫВОДЫ ДЛЯ СЕКТОРА RUS_FIN 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.764
  ➝ Сектор движется синхронно, VAR/VECM должны обнаружить сильные связи.

Готово.


================================================================================
📌 EDA СЕКТОР: RUS_OIL
================================================================================

Размер данных: (1616, 3)
Диапазон дат: 2015-11-20 00:00:00 → 2022-05-24 00:00:00
GAZP.ME LKOH.ME ROSN.ME
Date
2015-11-20 101.554611 1584.009644 209.611526
2015-11-23 104.264091 1654.354858 211.070862
2015-11-24 99.908340 1607.021851 204.119675
2015-11-25 98.769669 1625.730225 208.574554
2015-11-26 98.227753 1620.554443 208.036926
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
GAZP.ME 0.719025 0.01
LKOH.ME 0.395897 0.01
ROSN.ME 0.364743 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for rus_oil:
GAZP.ME    0.998291
LKOH.ME    0.997971
ROSN.ME    0.997103
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
GAZP.ME 7.368139e-30 0.1
LKOH.ME 7.547072e-10 0.1
ROSN.ME 1.210841e-27 0.1
📌 ВЫВОДЫ ДЛЯ СЕКТОРА RUS_OIL 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.719
  ➝ Сектор движется синхронно, VAR/VECM должны обнаружить сильные связи.

Готово.


================================================================================
📌 EDA СЕКТОР: RUS_MET
================================================================================

Размер данных: (1616, 3)
Диапазон дат: 2015-11-20 00:00:00 → 2022-05-24 00:00:00
NLMK.ME GMKN.ME CHMF.ME
Date
2015-11-20 33.449692 9591.611328 299.263031
2015-11-23 33.691441 9434.567383 298.410431
2015-11-24 32.682610 9287.459961 289.234833
2015-11-25 32.961552 9369.958984 290.371674
2015-11-26 33.166107 9333.181641 293.132446
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
NLMK.ME 0.662816 0.01
GMKN.ME 0.776743 0.01
CHMF.ME 0.834555 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for rus_met:
NLMK.ME    0.998954
GMKN.ME    0.998503
CHMF.ME    0.998702
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
NLMK.ME 2.130207e-26 0.1
GMKN.ME 0.000000e+00 0.1
CHMF.ME 1.231408e-16 0.1
📌 ВЫВОДЫ ДЛЯ СЕКТОРА RUS_MET 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.511
  ➝ Умеренная связь, модели должны выявить перекрестные эффекты.

Готово.


================================================================================
📌 EDA СЕКТОР: US_TECH
================================================================================

Размер данных: (2514, 3)
Диапазон дат: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
AAPL MSFT NVDA
Date
2015-11-23 26.548964 47.538147 0.754261
2015-11-24 26.803743 47.590771 0.760359
2015-11-25 26.612099 47.099518 0.759383
2015-11-27 26.562492 47.310066 0.765725
2015-11-30 26.672977 47.678497 0.773776
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
AAPL 0.982796 0.01
MSFT 0.967317 0.01
NVDA 0.998576 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for us_tech:
AAPL    0.999472
MSFT    0.999580
NVDA    0.999343
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
AAPL 5.121544e-29 0.1
MSFT 7.340334e-30 0.1
NVDA 2.902646e-30 0.1
📌 ВЫВОДЫ ДЛЯ СЕКТОРА US_TECH 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.617
  ➝ Сектор движется синхронно, VAR/VECM должны обнаружить сильные связи.

Готово.


================================================================================
📌 EDA СЕКТОР: US_RETAIL
================================================================================

Размер данных: (2514, 3)
Диапазон дат: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
WMT COST TGT
Date
2015-11-23 16.636299 137.534073 53.716831
2015-11-24 16.542435 136.473801 54.170162
2015-11-25 16.630775 136.642120 54.370827
2015-11-27 16.534153 137.643402 54.578926
2015-11-30 16.244270 135.825851 53.880329
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
WMT 0.998771 0.01
COST 0.983306 0.01
TGT 0.558303 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for us_retail:
WMT     0.999530
COST    0.999683
TGT     0.998726
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
WMT 6.741837e-30 0.1
COST 2.555191e-19 0.1
TGT 0.000000e+00 0.1
📌 ВЫВОДЫ ДЛЯ СЕКТОРА US_RETAIL 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.463
  ➝ Умеренная связь, модели должны выявить перекрестные эффекты.

Готово.


================================================================================
📌 EDA СЕКТОР: US_ENERGY
================================================================================

Размер данных: (2514, 3)
Диапазон дат: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
XOM CVX COP
Date
2015-11-23 51.805092 58.679672 39.201073
2015-11-24 52.837589 59.553265 40.702011
2015-11-25 52.431053 59.240341 40.010406
2015-11-27 52.418140 58.914387 39.348213
2015-11-30 52.695629 59.533695 39.767601
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
📊 Тесты стационарности (УРОВНИ):
ADF p-value KPSS p-value
XOM 0.895126 0.01
CVX 0.612451 0.01
COP 0.655668 0.01
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Lagged correlation (lag=1) for us_energy:
XOM    0.998986
CVX    0.998538
COP    0.998804
dtype: float64
No description has been provided for this image
📊 Тесты стационарности (ЛОГ-ДОХОДНОСТИ):
ADF p-value KPSS p-value
XOM 2.754644e-30 0.1
CVX 3.472096e-27 0.1
COP 1.632428e-18 0.1
📌 ВЫВОДЫ ДЛЯ СЕКТОРА US_ENERGY 

• Уровни цен: как правило НЕ стационарны.
  - ADF p-value > 0.05 → не отвергаем единичный корень.
  - KPSS p-value < 0.05 → ряд НЕ стационарен.
  Это НОРМА для финансовых цен → VECM на уровнях возможен.
• Лог-доходности: обычно стационарны.
  - ADF p < 0.05 → стационарны.
  - KPSS p > 0.05 → стационарность подтверждается.
  Это делает доходности корректным выбором для VAR/VARMAX.
• Средняя корреляция внутри сектора: 0.813
  ➝ Сектор движется синхронно, VAR/VECM должны обнаружить сильные связи.

Готово.




🎉 FULL EDA COMPLETED FOR ALL SECTORS

Запустим VAR¶

In [33]:
import os

# Создадим папки для хранения результатов
OUT_VAR = "VAR_results"
OUT_VECM = "VECM_results"
OUT_VARMAX = "VARMAX_results"
os.makedirs(OUT_VAR, exist_ok=True)
os.makedirs(OUT_VECM, exist_ok=True)
os.makedirs(OUT_VARMAX, exist_ok=True)
In [34]:
# VAR_pipeline_all_sectors.py
import os
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tools.eval_measures import rmse
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.stattools import durbin_watson

# ==========================
# Settings
# ==========================
SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}

OUT_DIR = "VAR_results"
os.makedirs(OUT_DIR, exist_ok=True)

# Forecast horizons to produce
FORECAST_HORIZONS = [10, 20, 100]

# How many last observations to use as holdout for forecast evaluation
HOLDOUT_DAYS = 100  # можно поменять

# Max lags to consider in select_order
MAX_LAGS = 10

# ==========================
# Utility functions
# ==========================
def ensure_df_numeric(df):
    """Ensure all columns numeric; coerce non-numeric to NaN then drop cols with all NaN."""
    df2 = df.copy()
    for c in df2.columns:
        df2[c] = pd.to_numeric(df2[c], errors="coerce")
    df2 = df2.dropna(axis=1, how='all')
    return df2

def compute_log_returns(df):
    # natural log returns
    return np.log(df).diff().dropna(how='all')

def savefig(fig, path):
    fig.savefig(path, bbox_inches='tight', dpi=150)
    plt.close(fig)

def summarize_var_results(var_res):
    """Return dict summary of diagnostics we will store."""
    res = {}
    # stability: companion roots
    try:
        roots = var_res.roots
        res['stable'] = np.all(np.abs(roots) < 1)
        res['max_root_modulus'] = np.max(np.abs(roots))
    except Exception:
        res['stable'] = np.nan
        res['max_root_modulus'] = np.nan

    # serial correlation
    try:
        serial_test = var_res.test_serial_correlation(lags=var_res.k_ar)
        res['serial_pvalue'] = serial_test.pvalue if hasattr(serial_test, 'pvalue') else np.nan
    except Exception:
        res['serial_pvalue'] = np.nan

    # normality
    try:
        norm_test = var_res.test_normality()
        res['normality_pvalue'] = norm_test['normality'].pvalue if hasattr(norm_test, 'normality') else np.nan
    except Exception:
        res['normality_pvalue'] = np.nan

    # arch
    try:
        arch_test = var_res.test_arch()
        res['arch_pvalue'] = arch_test.pvalue if hasattr(arch_test, 'pvalue') else np.nan
    except Exception:
        res['arch_pvalue'] = np.nan

    # durbin-watson for residuals per equation (average)
    try:
        dw = durbin_watson(var_res.resid)
        res['dw_mean'] = np.mean(dw)
    except Exception:
        res['dw_mean'] = np.nan

    return res

def forecast_and_reconstruct_prices(var_res, df_prices, df_returns_train, steps):
    """
    Forecast returns for 'steps' ahead using var_res (fitted on returns),
    then reconstruct levels (prices) by cumulatively applying returns to last known prices.
    df_prices: DataFrame of prices (aligned with returns)
    df_returns_train: returns DataFrame that was used to fit var_res
    """
    last_price = df_prices.iloc[-1]  # last observed prices
    # forecast returns
    fc_returns = var_res.forecast(y=df_returns_train.values[-var_res.k_ar:], steps=steps)
    start = df_prices.index[-1] + pd.Timedelta(days=1)
    idx = pd.date_range(start=start, periods=h, freq='B')
    fc_returns_df = pd.DataFrame(fc_returns, index=idx, columns=df_returns_train.columns)

    # reconstruct price paths: price_t+1 = price_t * exp(return_t+1)
    prices_fc = []
    prev = last_price.copy()
    for t in range(steps):
        ret = fc_returns_df.iloc[t]
        next_price = prev * np.exp(ret)
        prices_fc.append(next_price)
        prev = next_price
    prices_fc_df = pd.DataFrame(prices_fc, index=idx, columns=df_returns_train.columns)
    return fc_returns_df, prices_fc_df
In [35]:
# ==========================
# Main pipeline per sector
# ==========================
summary_rows = []

for sector_name, filename in SECTORS.items():
    print("\n" + "="*80)
    print(f"Processing sector: {sector_name}")
    print("="*80)

    try:
        df_prices = pd.read_csv(filename, index_col=0, parse_dates=True)
    except Exception as e:
        print(f"ERROR: cannot read {filename}: {e}")
        continue

    # Clean numeric & drop NaN cols if any
    df_prices = ensure_df_numeric(df_prices)
    print("Loaded prices shape:", df_prices.shape)

    # If fewer than 2 columns -> skip
    if df_prices.shape[1] < 2:
        print(f"Not enough series in {sector_name} after cleaning. Skipping.")
        continue

    # compute returns
    df_returns = compute_log_returns(df_prices).dropna(how='any')
    print("Returns shape (after dropna):", df_returns.shape)
    if df_returns.shape[0] < 50:
        print("Too few observations for reliable VAR. Consider expanding period. Skipping.")
        continue

    # train / holdout split: leave HOLDOUT_DAYS last obs for evaluation
    if df_returns.shape[0] > HOLDOUT_DAYS:
        train = df_returns.iloc[:-HOLDOUT_DAYS]
        test = df_returns.iloc[-HOLDOUT_DAYS:]
    else:
        train = df_returns.copy()
        test = pd.DataFrame()  # empty

    # Lag order selection
    model = VAR(train)
    try:
        sel = model.select_order(MAX_LAGS)
    except Exception as e:
        print("select_order failed:", e)
        sel = None

    # choose lag by AIC (fallback to BIC if None)
    if sel is not None:
        try:
            best_aic = sel.selected_orders.get('aic', None)
            best_bic = sel.selected_orders.get('bic', None)
            best_hqic = sel.selected_orders.get('hqic', None)
        except Exception:
            best_aic = sel.aic
            best_bic = sel.bic
            best_hqic = sel.hqic
    else:
        best_aic = best_bic = best_hqic = None

    # choose lag to fit: prefer AIC, but ensure >=1
    chosen_lag = int(best_aic) if (best_aic is not None and not pd.isna(best_aic) and best_aic>=1) else (int(best_bic) if best_bic is not None and best_bic>=1 else 1)
    chosen_lag = max(1, chosen_lag)
    print(f"Selected lags -> AIC: {best_aic}, BIC: {best_bic}, chosen: {chosen_lag}")

    # Fit VAR
    try:
        var_res = model.fit(chosen_lag)
    except Exception as e:
        print("VAR fit failed:", e)
        continue

    # Save summary text
    txt_out = os.path.join(OUT_DIR, f"{sector_name}_VAR_summary.txt")
    with open(txt_out, "w", encoding="utf-8") as f:
        # f.write(var_res.summary().as_text())
        var_summary = var_res.summary()

        try:
            # старый способ (если доступен)
            text = var_summary.as_text()
        except:
            try:
                # если объект содержит таблицы (обычно tables[0], tables[1])
                text = "\n\n".join([t.as_text() for t in var_summary.tables])
            except:
                # последний надёжный вариант
                text = str(var_summary)

        f.write(text)

    # Diagnostics
    diagnostics = summarize_var_results(var_res)
    diagnostics['sector'] = sector_name
    diagnostics['n_obs_train'] = train.shape[0]
    diagnostics['n_vars'] = train.shape[1]
    diagnostics['chosen_lag'] = chosen_lag

    # Save coefficients table
    coefs = var_res.params
    coefs.to_csv(os.path.join(OUT_DIR, f"{sector_name}_VAR_coeffs.csv"))

    # IRF
    try:
        irf = var_res.irf(20)
        fig_irf = irf.plot(orth=False)
        irf_png = os.path.join(OUT_DIR, f"{sector_name}_IRF.png")
        # statsmodels returns a matplotlib Figure if plot called; handle variant
        try:
            savefig(fig_irf, irf_png)
        except Exception:
            plt.savefig(irf_png, bbox_inches='tight', dpi=150)
            plt.close()
        print("IRF saved:", irf_png)
    except Exception as e:
        print("IRF failed:", e)

    # FEVD
    try:
        fevd = var_res.fevd(20)
        # plot fevd
        fig = fevd.plot()
        fevd_png = os.path.join(OUT_DIR, f"{sector_name}_FEVD.png")
        try:
            savefig(fig, fevd_png)
        except Exception:
            plt.savefig(fevd_png, bbox_inches='tight', dpi=150)
            plt.close()
        print("FEVD saved:", fevd_png)
    except Exception as e:
        print("FEVD failed:", e)

    # Forecasts for predefined horizons
    for h in FORECAST_HORIZONS:
        try:
            fc_ret_df, fc_price_df = forecast_and_reconstruct_prices(var_res, df_prices.loc[train.index.union(test.index)], train, h)
            fc_ret_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_forecast_returns_h{h}.csv"))
            fc_price_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_forecast_prices_h{h}.csv"))
            print(f"Forecasts (h={h}) saved for {sector_name}.")
        except Exception as e:
            print(f"Forecast failed for h={h}:", e)

    # Evaluate one-step-ahead rolling forecast on holdout if available
    eval_metrics = {}
    if not test.empty:
        try:
            # one-step rolling: refit on expanding window
            preds = []
            true = []
            train_copy = train.copy()
            for t in range(len(test)):
                m = VAR(train_copy)
                res_t = m.fit(chosen_lag)
                # forecast 1-step
                yhat = res_t.forecast(y=train_copy.values[-res_t.k_ar:], steps=1)[0]
                preds.append(yhat)
                true.append(test.values[t])
                # append true to train_copy for next step
                new_row = pd.DataFrame([test.values[t]], index=[test.index[t]], columns=train_copy.columns)
                train_copy = pd.concat([train_copy, new_row])
            preds = np.vstack(preds)
            true = np.vstack(true)
            # compute RMSE and MAE per series
            rmses = np.sqrt(np.mean((preds - true)**2, axis=0))
            maes = np.mean(np.abs(preds - true), axis=0)
            eval_df = pd.DataFrame({"ticker": train.columns, "RMSE": rmses, "MAE": maes})
            eval_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_rolling_eval.csv"), index=False)
            eval_metrics['rolling_RMSE_mean'] = eval_df["RMSE"].mean()
            eval_metrics['rolling_MAE_mean'] = eval_df["MAE"].mean()
            print("Rolling one-step evaluation saved.")
        except Exception as e:
            print("Rolling forecast evaluation failed:", e)

    # Diagnostics tests saved
    diag_df = pd.DataFrame.from_dict({sector_name: diagnostics}, orient='index')
    diag_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_diagnostics.csv"))

    # Save residuals diagnostics plots (ACF of residuals, histogram)
    try:
        resid = var_res.resid
        # residuals ACF for each eqn
        for col in resid.columns:
            fig, ax = plt.subplots(2,1, figsize=(8,6))
            sns.histplot(resid[col].dropna(), kde=True, ax=ax[0])
            ax[0].set_title(f"{sector_name} residuals histogram - {col}")
            from statsmodels.graphics.tsaplots import plot_acf
            plot_acf(resid[col].dropna(), ax=ax[1], lags=20)
            savefig(fig, os.path.join(OUT_DIR, f"{sector_name}_resid_{col}_hist_acf.png"))
        print("Residual diagnostic plots saved.")
    except Exception as e:
        print("Residual diagnostics failed:", e)

    # Append summary row
    row = {
        "sector": sector_name,
        "n_obs": df_returns.shape[0],
        "n_vars": df_returns.shape[1],
        "chosen_lag": chosen_lag,
        "stable": diagnostics.get('stable', np.nan),
        "max_root_modulus": diagnostics.get('max_root_modulus', np.nan),
        "serial_pvalue": diagnostics.get('serial_pvalue', np.nan),
        "normality_pvalue": diagnostics.get('normality_pvalue', np.nan),
        "arch_pvalue": diagnostics.get('arch_pvalue', np.nan),
        "dw_mean": diagnostics.get('dw_mean', np.nan),
        "rolling_RMSE_mean": eval_metrics.get('rolling_RMSE_mean', np.nan)
    }
    summary_rows.append(row)

# Save overall summary
summary_df = pd.DataFrame(summary_rows)
summary_df.to_csv(os.path.join(OUT_DIR, "VAR_overview_summary.csv"), index=False)
print("\nALL DONE. Results saved in folder:", OUT_DIR)
================================================================================
Processing sector: rus_fin
================================================================================
Loaded prices shape: (626, 3)
Returns shape (after dropna): (428, 3)
Selected lags -> AIC: 7, BIC: 0, chosen: 7
IRF saved: VAR_results\rus_fin_IRF.png
FEVD saved: VAR_results\rus_fin_FEVD.png
Forecasts (h=10) saved for rus_fin.
Forecasts (h=20) saved for rus_fin.
Forecasts (h=100) saved for rus_fin.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: rus_oil
================================================================================
Loaded prices shape: (1616, 3)
Returns shape (after dropna): (1615, 3)
Selected lags -> AIC: 2, BIC: 0, chosen: 2
IRF saved: VAR_results\rus_oil_IRF.png
FEVD saved: VAR_results\rus_oil_FEVD.png
Forecasts (h=10) saved for rus_oil.
Forecasts (h=20) saved for rus_oil.
Forecasts (h=100) saved for rus_oil.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: rus_met
================================================================================
Loaded prices shape: (1616, 3)
Returns shape (after dropna): (1615, 3)
Selected lags -> AIC: 1, BIC: 0, chosen: 1
IRF saved: VAR_results\rus_met_IRF.png
FEVD saved: VAR_results\rus_met_FEVD.png
Forecasts (h=10) saved for rus_met.
Forecasts (h=20) saved for rus_met.
Forecasts (h=100) saved for rus_met.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_tech
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 9, BIC: 1, chosen: 9
IRF saved: VAR_results\us_tech_IRF.png
FEVD saved: VAR_results\us_tech_FEVD.png
Forecasts (h=10) saved for us_tech.
Forecasts (h=20) saved for us_tech.
Forecasts (h=100) saved for us_tech.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_retail
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 1, BIC: 0, chosen: 1
IRF saved: VAR_results\us_retail_IRF.png
FEVD saved: VAR_results\us_retail_FEVD.png
Forecasts (h=10) saved for us_retail.
Forecasts (h=20) saved for us_retail.
Forecasts (h=100) saved for us_retail.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_energy
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 7, BIC: 0, chosen: 7
IRF saved: VAR_results\us_energy_IRF.png
FEVD saved: VAR_results\us_energy_FEVD.png
Forecasts (h=10) saved for us_energy.
Forecasts (h=20) saved for us_energy.
Forecasts (h=100) saved for us_energy.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

ALL DONE. Results saved in folder: VAR_results

Визулизируем получившиеся результаты

In [36]:
from IPython.display import Image, display

RESULTS_DIR = "VAR_results"

def show_pngs(dir):
    print("\n===== PNG визуализации =====\n")
    for file in sorted(os.listdir(dir)):
        if file.lower().endswith(".png"):
            path = os.path.join(dir, file)
            print(f"\n📌 Showing: {file}")
            display(Image(filename=path))

def show_csvs(dir, n_rows=10):
    print("\n===== CSV таблицы =====\n")
    for file in sorted(os.listdir(dir)):
        if file.lower().endswith(".csv"):
            path = os.path.join(dir, file)
            print(f"\n📄 CSV: {file}")
            try:
                df = pd.read_csv(path)
                display(df.head(n_rows))
            except Exception as e:
                print("⚠ Error reading CSV:", e)

def show_txts(dir):
    print("\n===== TXT summary =====\n")
    for file in sorted(os.listdir(dir)):
        if file.lower().endswith(".txt"):
            path = os.path.join(dir, file)
            print(f"\n📜 TXT: {file}")
            try:
                with open(path, "r", encoding="utf-8") as f:
                    print(f.read()[:2000])  # первые 2000 символов
            except Exception as e:
                print("⚠ Error reading TXT:", e)

print("Просмотр PNG/CSV/TXT из директории VAR_results")
show_csvs(RESULTS_DIR)
show_txts(RESULTS_DIR)
show_pngs(RESULTS_DIR)
Просмотр PNG/CSV/TXT из директории VAR_results

===== CSV таблицы =====


📄 CSV: VAR_overview_summary.csv
sector n_obs n_vars chosen_lag stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean rolling_RMSE_mean
0 rus_fin 428 3 7 False 4.975734 NaN NaN NaN 1.977567 0.090406
1 rus_oil 1615 3 2 False 24.717950 NaN NaN NaN 1.995539 0.061666
2 rus_met 1615 3 1 False 21.479943 NaN NaN NaN 1.999110 0.038605
3 us_tech 2513 3 9 False 1.730068 NaN NaN NaN 1.996385 0.015975
4 us_retail 2513 3 1 False 33.784002 NaN NaN NaN 1.999501 0.013985
5 us_energy 2513 3 7 False 3.077592 NaN NaN NaN 1.994895 0.013489
📄 CSV: rus_fin_VAR_coeffs.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 const -0.000812 -0.001614 0.001677
1 L1.SBER.ME -0.067880 0.000800 0.026687
2 L1.VTBR.ME 0.050555 0.149044 0.107342
3 L1.TCSG.ME 0.007783 0.026148 -0.050594
4 L2.SBER.ME -0.008981 0.100173 0.000664
5 L2.VTBR.ME -0.066138 -0.028755 -0.047856
6 L2.TCSG.ME 0.018146 -0.020642 -0.054481
7 L3.SBER.ME -0.172843 -0.043765 -0.356390
8 L3.VTBR.ME -0.015094 -0.074849 0.217579
9 L3.TCSG.ME 0.177597 0.196131 0.288098
📄 CSV: rus_fin_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 rus_fin False 4.975734 NaN NaN NaN 1.977567 rus_fin 328 3 7
📄 CSV: rus_fin_forecast_prices_h10.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 122.936519 0.018983 2176.249157
1 2022-05-26 123.396624 0.018904 2154.390228
2 2022-05-27 124.723073 0.019065 2187.639044
3 2022-05-30 125.121019 0.019122 2217.708226
4 2022-05-31 122.456150 0.018862 2176.851007
5 2022-06-01 123.934430 0.018990 2190.451168
6 2022-06-02 124.534915 0.019045 2218.257177
7 2022-06-03 123.605252 0.018940 2216.201904
8 2022-06-06 123.132373 0.018857 2205.076383
9 2022-06-07 123.611152 0.018882 2209.227650
📄 CSV: rus_fin_forecast_prices_h100.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 122.936519 0.018983 2176.249157
1 2022-05-26 123.396624 0.018904 2154.390228
2 2022-05-27 124.723073 0.019065 2187.639044
3 2022-05-30 125.121019 0.019122 2217.708226
4 2022-05-31 122.456150 0.018862 2176.851007
5 2022-06-01 123.934430 0.018990 2190.451168
6 2022-06-02 124.534915 0.019045 2218.257177
7 2022-06-03 123.605252 0.018940 2216.201904
8 2022-06-06 123.132373 0.018857 2205.076383
9 2022-06-07 123.611152 0.018882 2209.227650
📄 CSV: rus_fin_forecast_prices_h20.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 122.936519 0.018983 2176.249157
1 2022-05-26 123.396624 0.018904 2154.390228
2 2022-05-27 124.723073 0.019065 2187.639044
3 2022-05-30 125.121019 0.019122 2217.708226
4 2022-05-31 122.456150 0.018862 2176.851007
5 2022-06-01 123.934430 0.018990 2190.451168
6 2022-06-02 124.534915 0.019045 2218.257177
7 2022-06-03 123.605252 0.018940 2216.201904
8 2022-06-06 123.132373 0.018857 2205.076383
9 2022-06-07 123.611152 0.018882 2209.227650
📄 CSV: rus_fin_forecast_returns_h10.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 -0.021434 -0.009806 -0.027309
1 2022-05-26 0.003736 -0.004169 -0.010095
2 2022-05-27 0.010692 0.008474 0.015315
3 2022-05-30 0.003186 0.003014 0.013651
4 2022-05-31 -0.021528 -0.013706 -0.018595
5 2022-06-01 0.012000 0.006780 0.006228
6 2022-06-02 0.004833 0.002874 0.012614
7 2022-06-03 -0.007493 -0.005514 -0.000927
8 2022-06-06 -0.003833 -0.004394 -0.005033
9 2022-06-07 0.003881 0.001316 0.001881
📄 CSV: rus_fin_forecast_returns_h100.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 -0.021434 -0.009806 -0.027309
1 2022-05-26 0.003736 -0.004169 -0.010095
2 2022-05-27 0.010692 0.008474 0.015315
3 2022-05-30 0.003186 0.003014 0.013651
4 2022-05-31 -0.021528 -0.013706 -0.018595
5 2022-06-01 0.012000 0.006780 0.006228
6 2022-06-02 0.004833 0.002874 0.012614
7 2022-06-03 -0.007493 -0.005514 -0.000927
8 2022-06-06 -0.003833 -0.004394 -0.005033
9 2022-06-07 0.003881 0.001316 0.001881
📄 CSV: rus_fin_forecast_returns_h20.csv
Unnamed: 0 SBER.ME VTBR.ME TCSG.ME
0 2022-05-25 -0.021434 -0.009806 -0.027309
1 2022-05-26 0.003736 -0.004169 -0.010095
2 2022-05-27 0.010692 0.008474 0.015315
3 2022-05-30 0.003186 0.003014 0.013651
4 2022-05-31 -0.021528 -0.013706 -0.018595
5 2022-06-01 0.012000 0.006780 0.006228
6 2022-06-02 0.004833 0.002874 0.012614
7 2022-06-03 -0.007493 -0.005514 -0.000927
8 2022-06-06 -0.003833 -0.004394 -0.005033
9 2022-06-07 0.003881 0.001316 0.001881
📄 CSV: rus_fin_rolling_eval.csv
ticker RMSE MAE
0 SBER.ME 0.088903 0.044799
1 VTBR.ME 0.076988 0.040065
2 TCSG.ME 0.105329 0.062755
📄 CSV: rus_met_VAR_coeffs.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 const 0.001201 0.000496 0.001027
1 L1.NLMK.ME -0.094152 0.009384 -0.003245
2 L1.GMKN.ME 0.019060 0.039199 0.028179
3 L1.CHMF.ME 0.087085 0.016557 -0.053390
📄 CSV: rus_met_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 rus_met False 21.479943 NaN NaN NaN 1.99911 rus_met 1515 3 1
📄 CSV: rus_met_forecast_prices_h10.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 161.593204 21293.884481 1153.893576
1 2022-05-26 161.829379 21305.191851 1154.894727
2 2022-05-27 162.015516 21316.811296 1156.040182
3 2022-05-30 162.208414 21328.430669 1157.180743
4 2022-05-31 162.400849 21340.062364 1158.322589
5 2022-06-01 162.593581 21351.699833 1159.465580
6 2022-06-02 162.786535 21363.343715 1160.609694
7 2022-06-03 162.979719 21374.993940 1161.754938
8 2022-06-06 163.173132 21386.650520 1162.901312
9 2022-06-07 163.366774 21398.313456 1164.048818
📄 CSV: rus_met_forecast_prices_h100.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 161.593204 21293.884481 1153.893576
1 2022-05-26 161.829379 21305.191851 1154.894727
2 2022-05-27 162.015516 21316.811296 1156.040182
3 2022-05-30 162.208414 21328.430669 1157.180743
4 2022-05-31 162.400849 21340.062364 1158.322589
5 2022-06-01 162.593581 21351.699833 1159.465580
6 2022-06-02 162.786535 21363.343715 1160.609694
7 2022-06-03 162.979719 21374.993940 1161.754938
8 2022-06-06 163.173132 21386.650520 1162.901312
9 2022-06-07 163.366774 21398.313456 1164.048818
📄 CSV: rus_met_forecast_prices_h20.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 161.593204 21293.884481 1153.893576
1 2022-05-26 161.829379 21305.191851 1154.894727
2 2022-05-27 162.015516 21316.811296 1156.040182
3 2022-05-30 162.208414 21328.430669 1157.180743
4 2022-05-31 162.400849 21340.062364 1158.322589
5 2022-06-01 162.593581 21351.699833 1159.465580
6 2022-06-02 162.786535 21363.343715 1160.609694
7 2022-06-03 162.979719 21374.993940 1161.754938
8 2022-06-06 163.173132 21386.650520 1162.901312
9 2022-06-07 163.366774 21398.313456 1164.048818
📄 CSV: rus_met_forecast_returns_h10.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 -0.000166 -0.000287 0.002858
1 2022-05-26 0.001460 0.000531 0.000867
2 2022-05-27 0.001150 0.000545 0.000991
3 2022-05-30 0.001190 0.000545 0.000986
4 2022-05-31 0.001186 0.000545 0.000986
5 2022-06-01 0.001186 0.000545 0.000986
6 2022-06-02 0.001186 0.000545 0.000986
7 2022-06-03 0.001186 0.000545 0.000986
8 2022-06-06 0.001186 0.000545 0.000986
9 2022-06-07 0.001186 0.000545 0.000986
📄 CSV: rus_met_forecast_returns_h100.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 -0.000166 -0.000287 0.002858
1 2022-05-26 0.001460 0.000531 0.000867
2 2022-05-27 0.001150 0.000545 0.000991
3 2022-05-30 0.001190 0.000545 0.000986
4 2022-05-31 0.001186 0.000545 0.000986
5 2022-06-01 0.001186 0.000545 0.000986
6 2022-06-02 0.001186 0.000545 0.000986
7 2022-06-03 0.001186 0.000545 0.000986
8 2022-06-06 0.001186 0.000545 0.000986
9 2022-06-07 0.001186 0.000545 0.000986
📄 CSV: rus_met_forecast_returns_h20.csv
Unnamed: 0 NLMK.ME GMKN.ME CHMF.ME
0 2022-05-25 -0.000166 -0.000287 0.002858
1 2022-05-26 0.001460 0.000531 0.000867
2 2022-05-27 0.001150 0.000545 0.000991
3 2022-05-30 0.001190 0.000545 0.000986
4 2022-05-31 0.001186 0.000545 0.000986
5 2022-06-01 0.001186 0.000545 0.000986
6 2022-06-02 0.001186 0.000545 0.000986
7 2022-06-03 0.001186 0.000545 0.000986
8 2022-06-06 0.001186 0.000545 0.000986
9 2022-06-07 0.001186 0.000545 0.000986
📄 CSV: rus_met_rolling_eval.csv
ticker RMSE MAE
0 NLMK.ME 0.037225 0.020606
1 GMKN.ME 0.032015 0.017939
2 CHMF.ME 0.046574 0.025976
📄 CSV: rus_oil_VAR_coeffs.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 const 0.000783 0.000940 0.000618
1 L1.GAZP.ME 0.077234 -0.011864 0.000381
2 L1.LKOH.ME -0.026727 -0.040125 -0.000113
3 L1.ROSN.ME 0.001376 0.082202 0.119774
4 L2.GAZP.ME 0.027767 0.046319 -0.003363
5 L2.LKOH.ME -0.052866 -0.113622 -0.021621
6 L2.ROSN.ME 0.006264 0.014220 -0.000542
📄 CSV: rus_oil_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 rus_oil False 24.71795 NaN NaN NaN 1.995539 rus_oil 1515 3 2
📄 CSV: rus_oil_forecast_prices_h10.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 266.451624 4405.437944 398.788036
1 2022-05-26 266.793955 4414.768234 399.106213
2 2022-05-27 267.036984 4419.608328 399.412436
3 2022-05-30 267.238300 4423.049869 399.675929
4 2022-05-31 267.450351 4426.954081 399.943765
5 2022-06-01 267.665794 4430.965776 400.215263
6 2022-06-02 267.880335 4434.935890 400.486407
7 2022-06-03 268.094808 4438.902281 400.757450
8 2022-06-06 268.309559 4442.876534 401.028735
9 2022-06-07 268.524498 4446.854806 401.300227
📄 CSV: rus_oil_forecast_prices_h100.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 266.451624 4405.437944 398.788036
1 2022-05-26 266.793955 4414.768234 399.106213
2 2022-05-27 267.036984 4419.608328 399.412436
3 2022-05-30 267.238300 4423.049869 399.675929
4 2022-05-31 267.450351 4426.954081 399.943765
5 2022-06-01 267.665794 4430.965776 400.215263
6 2022-06-02 267.880335 4434.935890 400.486407
7 2022-06-03 268.094808 4438.902281 400.757450
8 2022-06-06 268.309559 4442.876534 401.028735
9 2022-06-07 268.524498 4446.854806 401.300227
📄 CSV: rus_oil_forecast_prices_h20.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 266.451624 4405.437944 398.788036
1 2022-05-26 266.793955 4414.768234 399.106213
2 2022-05-27 267.036984 4419.608328 399.412436
3 2022-05-30 267.238300 4423.049869 399.675929
4 2022-05-31 267.450351 4426.954081 399.943765
5 2022-06-01 267.665794 4430.965776 400.215263
6 2022-06-02 267.880335 4434.935890 400.486407
7 2022-06-03 268.094808 4438.902281 400.757450
8 2022-06-06 268.309559 4442.876534 401.028735
9 2022-06-07 268.524498 4446.854806 401.300227
📄 CSV: rus_oil_forecast_returns_h10.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 -0.000857 -0.002281 -0.002034
1 2022-05-26 0.001284 0.002116 0.000798
2 2022-05-27 0.000911 0.001096 0.000767
3 2022-05-30 0.000754 0.000778 0.000659
4 2022-05-31 0.000793 0.000882 0.000670
5 2022-06-01 0.000805 0.000906 0.000679
6 2022-06-02 0.000801 0.000896 0.000677
7 2022-06-03 0.000800 0.000894 0.000677
8 2022-06-06 0.000801 0.000895 0.000677
9 2022-06-07 0.000801 0.000895 0.000677
📄 CSV: rus_oil_forecast_returns_h100.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 -0.000857 -0.002281 -0.002034
1 2022-05-26 0.001284 0.002116 0.000798
2 2022-05-27 0.000911 0.001096 0.000767
3 2022-05-30 0.000754 0.000778 0.000659
4 2022-05-31 0.000793 0.000882 0.000670
5 2022-06-01 0.000805 0.000906 0.000679
6 2022-06-02 0.000801 0.000896 0.000677
7 2022-06-03 0.000800 0.000894 0.000677
8 2022-06-06 0.000801 0.000895 0.000677
9 2022-06-07 0.000801 0.000895 0.000677
📄 CSV: rus_oil_forecast_returns_h20.csv
Unnamed: 0 GAZP.ME LKOH.ME ROSN.ME
0 2022-05-25 -0.000857 -0.002281 -0.002034
1 2022-05-26 0.001284 0.002116 0.000798
2 2022-05-27 0.000911 0.001096 0.000767
3 2022-05-30 0.000754 0.000778 0.000659
4 2022-05-31 0.000793 0.000882 0.000670
5 2022-06-01 0.000805 0.000906 0.000679
6 2022-06-02 0.000801 0.000896 0.000677
7 2022-06-03 0.000800 0.000894 0.000677
8 2022-06-06 0.000801 0.000895 0.000677
9 2022-06-07 0.000801 0.000895 0.000677
📄 CSV: rus_oil_rolling_eval.csv
ticker RMSE MAE
0 GAZP.ME 0.059383 0.030505
1 LKOH.ME 0.059952 0.028453
2 ROSN.ME 0.065664 0.028774
📄 CSV: us_energy_VAR_coeffs.csv
Unnamed: 0 XOM CVX COP
0 const 0.000329 0.000374 0.000383
1 L1.XOM 0.112250 0.097129 0.166683
2 L1.CVX -0.135727 -0.165454 -0.251606
3 L1.COP -0.012350 0.006618 0.011816
4 L2.XOM -0.095013 -0.115642 -0.027888
5 L2.CVX 0.072186 0.109733 0.037841
6 L2.COP 0.039296 0.040802 0.018680
7 L3.XOM -0.097785 -0.016026 -0.072616
8 L3.CVX -0.017596 -0.061729 -0.068044
9 L3.COP 0.069671 0.053246 0.105017
📄 CSV: us_energy_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 us_energy False 3.077592 NaN NaN NaN 1.994895 us_energy 2413 3 7
📄 CSV: us_energy_forecast_prices_h10.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 117.061793 150.224186 87.295047
1 2025-11-24 117.437936 150.395922 87.395788
2 2025-11-25 117.317412 150.139710 87.265534
3 2025-11-26 117.386376 150.384276 87.415440
4 2025-11-27 117.360757 150.412089 87.407909
5 2025-11-28 117.382439 150.373188 87.376548
6 2025-12-01 117.625827 150.906205 87.667384
7 2025-12-02 117.612495 150.851548 87.625205
8 2025-12-03 117.664964 150.969592 87.677646
9 2025-12-04 117.664591 150.937111 87.669042
📄 CSV: us_energy_forecast_prices_h100.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 117.061793 150.224186 87.295047
1 2025-11-24 117.437936 150.395922 87.395788
2 2025-11-25 117.317412 150.139710 87.265534
3 2025-11-26 117.386376 150.384276 87.415440
4 2025-11-27 117.360757 150.412089 87.407909
5 2025-11-28 117.382439 150.373188 87.376548
6 2025-12-01 117.625827 150.906205 87.667384
7 2025-12-02 117.612495 150.851548 87.625205
8 2025-12-03 117.664964 150.969592 87.677646
9 2025-12-04 117.664591 150.937111 87.669042
📄 CSV: us_energy_forecast_prices_h20.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 117.061793 150.224186 87.295047
1 2025-11-24 117.437936 150.395922 87.395788
2 2025-11-25 117.317412 150.139710 87.265534
3 2025-11-26 117.386376 150.384276 87.415440
4 2025-11-27 117.360757 150.412089 87.407909
5 2025-11-28 117.382439 150.373188 87.376548
6 2025-12-01 117.625827 150.906205 87.667384
7 2025-12-02 117.612495 150.851548 87.625205
8 2025-12-03 117.664964 150.969592 87.677646
9 2025-12-04 117.664591 150.937111 87.669042
📄 CSV: us_energy_forecast_returns_h10.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 0.000357 -0.000571 -0.002002
1 2025-11-24 0.003208 0.001143 0.001153
2 2025-11-25 -0.001027 -0.001705 -0.001492
3 2025-11-26 0.000588 0.001628 0.001716
4 2025-11-27 -0.000218 0.000185 -0.000086
5 2025-11-28 0.000185 -0.000259 -0.000359
6 2025-12-01 0.002071 0.003538 0.003323
7 2025-12-02 -0.000113 -0.000362 -0.000481
8 2025-12-03 0.000446 0.000782 0.000598
9 2025-12-04 -0.000003 -0.000215 -0.000098
📄 CSV: us_energy_forecast_returns_h100.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 0.000357 -0.000571 -0.002002
1 2025-11-24 0.003208 0.001143 0.001153
2 2025-11-25 -0.001027 -0.001705 -0.001492
3 2025-11-26 0.000588 0.001628 0.001716
4 2025-11-27 -0.000218 0.000185 -0.000086
5 2025-11-28 0.000185 -0.000259 -0.000359
6 2025-12-01 0.002071 0.003538 0.003323
7 2025-12-02 -0.000113 -0.000362 -0.000481
8 2025-12-03 0.000446 0.000782 0.000598
9 2025-12-04 -0.000003 -0.000215 -0.000098
📄 CSV: us_energy_forecast_returns_h20.csv
Unnamed: 0 XOM CVX COP
0 2025-11-21 0.000357 -0.000571 -0.002002
1 2025-11-24 0.003208 0.001143 0.001153
2 2025-11-25 -0.001027 -0.001705 -0.001492
3 2025-11-26 0.000588 0.001628 0.001716
4 2025-11-27 -0.000218 0.000185 -0.000086
5 2025-11-28 0.000185 -0.000259 -0.000359
6 2025-12-01 0.002071 0.003538 0.003323
7 2025-12-02 -0.000113 -0.000362 -0.000481
8 2025-12-03 0.000446 0.000782 0.000598
9 2025-12-04 -0.000003 -0.000215 -0.000098
📄 CSV: us_energy_rolling_eval.csv
ticker RMSE MAE
0 XOM 0.011902 0.009574
1 CVX 0.012480 0.009807
2 COP 0.016086 0.012614
📄 CSV: us_retail_VAR_coeffs.csv
Unnamed: 0 WMT COST TGT
0 const 0.000772 0.000845 0.000274
1 L1.WMT -0.040524 0.000503 0.014336
2 L1.COST 0.004596 -0.032054 -0.024830
3 L1.TGT -0.034155 -0.002398 -0.018063
📄 CSV: us_retail_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 us_retail False 33.784002 NaN NaN NaN 1.999501 us_retail 2413 3 1
📄 CSV: us_retail_forecast_prices_h10.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 106.982471 894.052621 83.639320
1 2025-11-24 107.072446 894.784533 83.659800
2 2025-11-25 107.150979 895.517396 83.681690
3 2025-11-26 107.229977 896.250763 83.703431
4 2025-11-27 107.309022 896.984739 83.725185
5 2025-11-28 107.388126 897.719316 83.746945
6 2025-12-01 107.467289 898.454495 83.768710
7 2025-12-02 107.546509 899.190275 83.790481
8 2025-12-03 107.625788 899.926659 83.812257
9 2025-12-04 107.705126 900.663645 83.834039
📄 CSV: us_retail_forecast_prices_h100.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 106.982471 894.052621 83.639320
1 2025-11-24 107.072446 894.784533 83.659800
2 2025-11-25 107.150979 895.517396 83.681690
3 2025-11-26 107.229977 896.250763 83.703431
4 2025-11-27 107.309022 896.984739 83.725185
5 2025-11-28 107.388126 897.719316 83.746945
6 2025-12-01 107.467289 898.454495 83.768710
7 2025-12-02 107.546509 899.190275 83.790481
8 2025-12-03 107.625788 899.926659 83.812257
9 2025-12-04 107.705126 900.663645 83.834039
📄 CSV: us_retail_forecast_prices_h20.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 106.982471 894.052621 83.639320
1 2025-11-24 107.072446 894.784533 83.659800
2 2025-11-25 107.150979 895.517396 83.681690
3 2025-11-26 107.229977 896.250763 83.703431
4 2025-11-27 107.309022 896.984739 83.725185
5 2025-11-28 107.388126 897.719316 83.746945
6 2025-12-01 107.467289 898.454495 83.768710
7 2025-12-02 107.546509 899.190275 83.790481
8 2025-12-03 107.625788 899.926659 83.812257
9 2025-12-04 107.705126 900.663645 83.834039
📄 CSV: us_retail_forecast_returns_h10.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 -0.001191 0.000853 -0.000486
1 2025-11-24 0.000841 0.000818 0.000245
2 2025-11-25 0.000733 0.000819 0.000262
3 2025-11-26 0.000737 0.000819 0.000260
4 2025-11-27 0.000737 0.000819 0.000260
5 2025-11-28 0.000737 0.000819 0.000260
6 2025-12-01 0.000737 0.000819 0.000260
7 2025-12-02 0.000737 0.000819 0.000260
8 2025-12-03 0.000737 0.000819 0.000260
9 2025-12-04 0.000737 0.000819 0.000260
📄 CSV: us_retail_forecast_returns_h100.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 -0.001191 0.000853 -0.000486
1 2025-11-24 0.000841 0.000818 0.000245
2 2025-11-25 0.000733 0.000819 0.000262
3 2025-11-26 0.000737 0.000819 0.000260
4 2025-11-27 0.000737 0.000819 0.000260
5 2025-11-28 0.000737 0.000819 0.000260
6 2025-12-01 0.000737 0.000819 0.000260
7 2025-12-02 0.000737 0.000819 0.000260
8 2025-12-03 0.000737 0.000819 0.000260
9 2025-12-04 0.000737 0.000819 0.000260
📄 CSV: us_retail_forecast_returns_h20.csv
Unnamed: 0 WMT COST TGT
0 2025-11-21 -0.001191 0.000853 -0.000486
1 2025-11-24 0.000841 0.000818 0.000245
2 2025-11-25 0.000733 0.000819 0.000262
3 2025-11-26 0.000737 0.000819 0.000260
4 2025-11-27 0.000737 0.000819 0.000260
5 2025-11-28 0.000737 0.000819 0.000260
6 2025-12-01 0.000737 0.000819 0.000260
7 2025-12-02 0.000737 0.000819 0.000260
8 2025-12-03 0.000737 0.000819 0.000260
9 2025-12-04 0.000737 0.000819 0.000260
📄 CSV: us_retail_rolling_eval.csv
ticker RMSE MAE
0 WMT 0.013503 0.008798
1 COST 0.010649 0.008172
2 TGT 0.017805 0.013944
📄 CSV: us_tech_VAR_coeffs.csv
Unnamed: 0 AAPL MSFT NVDA
0 const 0.000766 0.001104 0.002273
1 L1.AAPL 0.001441 -0.046184 -0.031821
2 L1.MSFT -0.105998 -0.151740 -0.165332
3 L1.NVDA 0.019841 0.034317 0.001764
4 L2.AAPL 0.008761 -0.013327 -0.011733
5 L2.MSFT -0.077339 -0.058800 -0.063556
6 L2.NVDA 0.045931 0.026213 0.057473
7 L3.AAPL -0.082228 -0.088988 -0.108386
8 L3.MSFT 0.095441 0.047166 0.115579
9 L3.NVDA -0.005091 0.009845 -0.012694
📄 CSV: us_tech_diagnostics.csv
Unnamed: 0 stable max_root_modulus serial_pvalue normality_pvalue arch_pvalue dw_mean sector n_obs_train n_vars chosen_lag
0 us_tech False 1.730068 NaN NaN NaN 1.996385 us_tech 2413 3 9
📄 CSV: us_tech_forecast_prices_h10.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 267.128394 478.730423 181.688619
1 2025-11-24 267.215133 478.337166 181.904773
2 2025-11-25 266.775108 477.680615 181.839489
3 2025-11-26 267.044446 477.808840 182.450524
4 2025-11-27 267.200126 478.600831 182.780324
5 2025-11-28 267.688967 480.125973 183.106679
6 2025-12-01 267.285250 480.268962 182.933760
7 2025-12-02 267.900427 482.052664 183.944021
8 2025-12-03 268.250598 482.707000 184.616566
9 2025-12-04 268.552026 483.243189 185.102873
📄 CSV: us_tech_forecast_prices_h100.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 267.128394 478.730423 181.688619
1 2025-11-24 267.215133 478.337166 181.904773
2 2025-11-25 266.775108 477.680615 181.839489
3 2025-11-26 267.044446 477.808840 182.450524
4 2025-11-27 267.200126 478.600831 182.780324
5 2025-11-28 267.688967 480.125973 183.106679
6 2025-12-01 267.285250 480.268962 182.933760
7 2025-12-02 267.900427 482.052664 183.944021
8 2025-12-03 268.250598 482.707000 184.616566
9 2025-12-04 268.552026 483.243189 185.102873
📄 CSV: us_tech_forecast_prices_h20.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 267.128394 478.730423 181.688619
1 2025-11-24 267.215133 478.337166 181.904773
2 2025-11-25 266.775108 477.680615 181.839489
3 2025-11-26 267.044446 477.808840 182.450524
4 2025-11-27 267.200126 478.600831 182.780324
5 2025-11-28 267.688967 480.125973 183.106679
6 2025-12-01 267.285250 480.268962 182.933760
7 2025-12-02 267.900427 482.052664 183.944021
8 2025-12-03 268.250598 482.707000 184.616566
9 2025-12-04 268.552026 483.243189 185.102873
📄 CSV: us_tech_forecast_returns_h10.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 0.003294 0.000628 0.005788
1 2025-11-24 0.000325 -0.000822 0.001189
2 2025-11-25 -0.001648 -0.001374 -0.000359
3 2025-11-26 0.001009 0.000268 0.003355
4 2025-11-27 0.000583 0.001656 0.001806
5 2025-11-28 0.001828 0.003182 0.001784
6 2025-12-01 -0.001509 0.000298 -0.000945
7 2025-12-02 0.002299 0.003707 0.005507
8 2025-12-03 0.001306 0.001356 0.003650
9 2025-12-04 0.001123 0.001110 0.002631
📄 CSV: us_tech_forecast_returns_h100.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 0.003294 0.000628 0.005788
1 2025-11-24 0.000325 -0.000822 0.001189
2 2025-11-25 -0.001648 -0.001374 -0.000359
3 2025-11-26 0.001009 0.000268 0.003355
4 2025-11-27 0.000583 0.001656 0.001806
5 2025-11-28 0.001828 0.003182 0.001784
6 2025-12-01 -0.001509 0.000298 -0.000945
7 2025-12-02 0.002299 0.003707 0.005507
8 2025-12-03 0.001306 0.001356 0.003650
9 2025-12-04 0.001123 0.001110 0.002631
📄 CSV: us_tech_forecast_returns_h20.csv
Unnamed: 0 AAPL MSFT NVDA
0 2025-11-21 0.003294 0.000628 0.005788
1 2025-11-24 0.000325 -0.000822 0.001189
2 2025-11-25 -0.001648 -0.001374 -0.000359
3 2025-11-26 0.001009 0.000268 0.003355
4 2025-11-27 0.000583 0.001656 0.001806
5 2025-11-28 0.001828 0.003182 0.001784
6 2025-12-01 -0.001509 0.000298 -0.000945
7 2025-12-02 0.002299 0.003707 0.005507
8 2025-12-03 0.001306 0.001356 0.003650
9 2025-12-04 0.001123 0.001110 0.002631
📄 CSV: us_tech_rolling_eval.csv
ticker RMSE MAE
0 AAPL 0.014857 0.010576
1 MSFT 0.011791 0.008799
2 NVDA 0.021276 0.016575
===== TXT summary =====


📜 TXT: rus_fin_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:05:47
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -22.9311
Nobs:                     321.000    HQIC:                  -23.3969
Log likelihood:           2504.46    FPE:                5.06594e-11
AIC:                     -23.7065    Det(Omega_mle):     4.15234e-11
--------------------------------------------------------------------
Results for equation SBER.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const             -0.000812         0.001116           -0.728           0.466
L1.SBER.ME        -0.067880         0.084686           -0.802           0.423
L1.VTBR.ME         0.050555         0.088692            0.570           0.569
L1.TCSG.ME         0.007783         0.040678            0.191           0.848
L2.SBER.ME        -0.008981         0.083946           -0.107           0.915
L2.VTBR.ME        -0.066138         0.089235           -0.741           0.459
L2.TCSG.ME         0.018146         0.040709            0.446           0.656
L3.SBER.ME        -0.172843         0.084269           -2.051           0.040
L3.VTBR.ME        -0.015094         0.088766           -0.170           0.865
L3.TCSG.ME         0.177597         0.040526            4.382           0.000
L4.SBER.ME         0.336753         0.084832            3.970           0.000
L4.VTBR.ME        -0.215702         0.088069           -2.449           0.014
L4.TCSG.ME        -0.067918         0.042639           -1.593           0.111
L5.SBER.ME         0.216999         0.086413            2.511           0.012
L5.VTBR.ME        -0.

📜 TXT: rus_met_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:05:50
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.1609
Nobs:                     1514.00    HQIC:                  -25.1873
Log likelihood:           12645.9    FPE:                1.13358e-11
AIC:                     -25.2031    Det(Omega_mle):     1.12464e-11
--------------------------------------------------------------------
Results for equation NLMK.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.001201         0.000445            2.700           0.007
L1.NLMK.ME        -0.094152         0.032943           -2.858           0.004
L1.GMKN.ME         0.019060         0.027577            0.691           0.489
L1.CHMF.ME         0.087085         0.037644            2.313           0.021
=============================================================================

Results for equation GMKN.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000496         0.000447            1.109           0.267
L1.NLMK.ME         0.009384         0.033126            0.283           0.777
L1.GMKN.ME         0.039199         0.027730            1.414           0.157
L1.CHMF.ME         0.016557         0.037853            0.437           0.662
=============================================================================

Results for equation CHMF.ME
=======================================

📜 TXT: rus_oil_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:05:49
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.5014
Nobs:                     1513.00    HQIC:                  -25.5478
Log likelihood:           12928.1    FPE:                7.81250e-12
AIC:                     -25.5753    Det(Omega_mle):     7.70506e-12
--------------------------------------------------------------------
Results for equation GAZP.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000783         0.000392            1.995           0.046
L1.GAZP.ME         0.077234         0.031948            2.417           0.016
L1.LKOH.ME        -0.026727         0.031154           -0.858           0.391
L1.ROSN.ME         0.001376         0.032071            0.043           0.966
L2.GAZP.ME         0.027767         0.031925            0.870           0.384
L2.LKOH.ME        -0.052866         0.031085           -1.701           0.089
L2.ROSN.ME         0.006264         0.031922            0.196           0.844
=============================================================================

Results for equation LKOH.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000940         0.000457            2.055           0.040
L1.GAZP.ME        -0.011864         0.037231           -0.319           0.750
L1.LKOH.ME        -0.040125         0.036307           -1.105        

📜 TXT: us_energy_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:06:10
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.7027
Nobs:                     2406.00    HQIC:                  -25.8037
Log likelihood:           20935.4    FPE:                5.86841e-12
AIC:                     -25.8614    Det(Omega_mle):     5.71033e-12
--------------------------------------------------------------------
Results for equation XOM
=========================================================================
            coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------
const          0.000329         0.000359            0.917           0.359
L1.XOM         0.112250         0.039396            2.849           0.004
L1.CVX        -0.135727         0.038881           -3.491           0.000
L1.COP        -0.012350         0.026862           -0.460           0.646
L2.XOM        -0.095013         0.039435           -2.409           0.016
L2.CVX         0.072186         0.038921            1.855           0.064
L2.COP         0.039296         0.026859            1.463           0.143
L3.XOM        -0.097785         0.039545           -2.473           0.013
L3.CVX        -0.017596         0.038948           -0.452           0.651
L3.COP         0.069671         0.026837            2.596           0.009
L4.XOM         0.051083         0.039649            1.288           0.198
L4.CVX        -0.124363         0.038951           -3.193           0.001
L4.COP         0.067654         0.026853            2.519           0.012
L5.XOM         0.047316         0.039589            1.195           0.232
L5.CVX        -0.101167         0.039025           -2.592           0.010
L5.COP         0.07

📜 TXT: us_retail_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:06:03
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.4775
Nobs:                     2412.00    HQIC:                  -25.4958
Log likelihood:           20505.2    FPE:                8.37039e-12
AIC:                     -25.5063    Det(Omega_mle):     8.32889e-12
--------------------------------------------------------------------
Results for equation WMT
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000772         0.000276            2.797           0.005
L1.WMT         -0.040524         0.025221           -1.607           0.108
L1.COST         0.004596         0.024807            0.185           0.853
L1.TGT         -0.034155         0.014933           -2.287           0.022
==========================================================================

Results for equation COST
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000845         0.000285            2.960           0.003
L1.WMT          0.000503         0.026094            0.019           0.985
L1.COST        -0.032054         0.025666           -1.249           0.212
L1.TGT         -0.002398         0.015450           -0.155           0.877
==========================================================================

Results for equation TGT
==========================================================================
             coefficien

📜 TXT: us_tech_VAR_summary.txt
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:05:52
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -24.0541
Nobs:                     2404.00    HQIC:                  -24.1827
Log likelihood:           19006.6    FPE:                2.92194e-11
AIC:                     -24.2562    Det(Omega_mle):     2.82217e-11
--------------------------------------------------------------------
Results for equation AAPL
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000766         0.000381            2.008           0.045
L1.AAPL         0.001441         0.028744            0.050           0.960
L1.MSFT        -0.105998         0.033638           -3.151           0.002
L1.NVDA         0.019841         0.015473            1.282           0.200
L2.AAPL         0.008761         0.028780            0.304           0.761
L2.MSFT        -0.077339         0.033760           -2.291           0.022
L2.NVDA         0.045931         0.015502            2.963           0.003
L3.AAPL        -0.082228         0.028683           -2.867           0.004
L3.MSFT         0.095441         0.033679            2.834           0.005
L3.NVDA        -0.005091         0.015520           -0.328           0.743
L4.AAPL        -0.046314         0.028751           -1.611           0.107
L4.MSFT         0.053454         0.033728            1.585           0.113
L4.NVDA        -0.004731         0.015519           -0.305           0.760
L5.AAPL         0.014151         0.028740            0.492           0.622
L5.MSFT         0.012282         0.033731            0.364           0.716


===== PNG визуализации =====


📌 Showing: rus_fin_FEVD.png
No description has been provided for this image
📌 Showing: rus_fin_IRF.png
No description has been provided for this image
📌 Showing: rus_fin_resid_SBER.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_fin_resid_TCSG.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_fin_resid_VTBR.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_met_FEVD.png
No description has been provided for this image
📌 Showing: rus_met_IRF.png
No description has been provided for this image
📌 Showing: rus_met_resid_CHMF.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_met_resid_GMKN.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_met_resid_NLMK.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_oil_FEVD.png
No description has been provided for this image
📌 Showing: rus_oil_IRF.png
No description has been provided for this image
📌 Showing: rus_oil_resid_GAZP.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_oil_resid_LKOH.ME_hist_acf.png
No description has been provided for this image
📌 Showing: rus_oil_resid_ROSN.ME_hist_acf.png
No description has been provided for this image
📌 Showing: us_energy_FEVD.png
No description has been provided for this image
📌 Showing: us_energy_IRF.png
No description has been provided for this image
📌 Showing: us_energy_resid_COP_hist_acf.png
No description has been provided for this image
📌 Showing: us_energy_resid_CVX_hist_acf.png
No description has been provided for this image
📌 Showing: us_energy_resid_XOM_hist_acf.png
No description has been provided for this image
📌 Showing: us_retail_FEVD.png
No description has been provided for this image
📌 Showing: us_retail_IRF.png
No description has been provided for this image
📌 Showing: us_retail_resid_COST_hist_acf.png
No description has been provided for this image
📌 Showing: us_retail_resid_TGT_hist_acf.png
No description has been provided for this image
📌 Showing: us_retail_resid_WMT_hist_acf.png
No description has been provided for this image
📌 Showing: us_tech_FEVD.png
No description has been provided for this image
📌 Showing: us_tech_IRF.png
No description has been provided for this image
📌 Showing: us_tech_resid_AAPL_hist_acf.png
No description has been provided for this image
📌 Showing: us_tech_resid_MSFT_hist_acf.png
No description has been provided for this image
📌 Showing: us_tech_resid_NVDA_hist_acf.png
No description has been provided for this image

Проверим тест Йохансена на проверку коинтегрирующих векторов¶

In [37]:
# johansen_per_sector.py
import os
import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT = "VECM_results"
os.makedirs(OUT, exist_ok=True)

def johansen_test_df(df, det_order=0, k_ar_diff=1):
    # df must be levels, no NaN
    res = coint_johansen(df, det_order, k_ar_diff)
    # trace statistics res.lr1, critical values res.cvt
    trace = res.lr1
    cvt = res.cvt  # rows ~ ranks, cols 90/95/99
    return res, trace, cvt

summary_rows = []
for name, path in SECTORS.items():
    try:
        df = pd.read_csv(path, index_col=0, parse_dates=True)
        df = df.dropna(how='any')
        print(f"\n{name}: shape {df.shape}")
        # choose k_ar_diff = lag used in VECM = (lag selected for VAR on returns)
        # safe default:
        k_ar_diff = 1
        res, trace, cvt = johansen_test_df(df, det_order=0, k_ar_diff=k_ar_diff)
        # determine rank by comparing trace with 95% critical values
        rank = 0
        for i, tr in enumerate(trace):
            crit95 = cvt[i,1]  # 95% (index 1)
            if tr > crit95:
                rank = i+1
        # save summary
        summary_rows.append({"sector": name, "n_obs": df.shape[0], "n_vars": df.shape[1], "coint_rank": rank})
        # write detailed output
        with open(os.path.join(OUT, f"{name}_johansen.txt"), "w", encoding="utf-8") as f:
            f.write("Trace stats:\n")
            for i, t in enumerate(trace):
                f.write(f"r <= {i} : trace_stat = {t:.4f}, crit90={cvt[i,0]}, crit95={cvt[i,1]}, crit99={cvt[i,2]}\n")
            f.write("\nEigenvectors (res.evec):\n")
            f.write(str(res.evec))
        print(f"{name} -> estimated coint_rank = {rank}")
    except Exception as e:
        print("Failed", name, e)

pd.DataFrame(summary_rows).to_csv(os.path.join(OUT, "johansen_summary.csv"), index=False)
print("\nSaved johansen_summary.csv")
rus_fin: shape (626, 3)
rus_fin -> estimated coint_rank = 0

rus_oil: shape (1616, 3)
rus_oil -> estimated coint_rank = 0

rus_met: shape (1616, 3)
rus_met -> estimated coint_rank = 0

us_tech: shape (2514, 3)
us_tech -> estimated coint_rank = 1

us_retail: shape (2514, 3)
us_retail -> estimated coint_rank = 0

us_energy: shape (2514, 3)
us_energy -> estimated coint_rank = 0

Saved johansen_summary.csv

Построим VECM¶

In [38]:
# build_vecm_for_ranked.py
import os
import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM
from datetime import datetime

IN_DIR = "VECM_results"
os.makedirs(IN_DIR, exist_ok=True)

johansen_summary = pd.read_csv(os.path.join(IN_DIR, "johansen_summary.csv"))
for row in johansen_summary.itertuples(index=False):
    name = row.sector
    rank = int(row.coint_rank)
    if rank <= 0:
        print(f"{name}: no cointegration (rank={rank}), skipping VECM.")
        continue
    # load data
    df = pd.read_csv(f"{name}_prices.csv", index_col=0, parse_dates=True)
    df = df.dropna(how='any')
    print(f"Building VECM for {name}, shape {df.shape}, rank {rank}")

    # choose lag differences (k_ar_diff) — can be tuned; use 1 as default
    k_ar_diff = 1
    try:
        vecm = VECM(df, k_ar_diff=k_ar_diff, coint_rank=rank, deterministic='ci')
        vecm_res = vecm.fit()
        # save summary
        with open(os.path.join(IN_DIR, f"{name}_VECM_summary.txt"), "w", encoding="utf-8") as f:
            try:
                f.write(str(vecm_res.summary()))
            except:
                f.write("VECM fitted. Summary not printable.\n")
        # save alpha and beta
        pd.DataFrame(vecm_res.alpha, index=df.columns).to_csv(os.path.join(IN_DIR, f"{name}_vecm_alpha.csv"))
        pd.DataFrame(vecm_res.beta, index=df.columns).to_csv(os.path.join(IN_DIR, f"{name}_vecm_beta.csv"))
        print(f"Saved VECM results for {name}")

        # Forecasting: steps 10/20/100
        steps_list = [10,20,100]
        for steps in steps_list:
            fc = vecm_res.predict(steps=steps)
            start = df.index[-1] + pd.Timedelta(days=1)
            idx = pd.date_range(start=start, periods=h, freq='B')
            fc_df = pd.DataFrame(fc, index=idx, columns=df.columns)
            fc_df.to_csv(os.path.join(IN_DIR, f"{name}_vecm_forecast_h{steps}.csv"))
    except Exception as e:
        print(f"VECM fit failed for {name}: {e}")
rus_fin: no cointegration (rank=0), skipping VECM.
rus_oil: no cointegration (rank=0), skipping VECM.
rus_met: no cointegration (rank=0), skipping VECM.
Building VECM for us_tech, shape (2514, 3), rank 1
Saved VECM results for us_tech
VECM fit failed for us_tech: Shape of passed values is (10, 3), indices imply (100, 3)
us_retail: no cointegration (rank=0), skipping VECM.
us_energy: no cointegration (rank=0), skipping VECM.
In [39]:
show_csvs(dir='VECM_results')
show_txts(dir='VECM_results')
===== CSV таблицы =====


📄 CSV: johansen_summary.csv
sector n_obs n_vars coint_rank
0 rus_fin 626 3 0
1 rus_oil 1616 3 0
2 rus_met 1616 3 0
3 us_tech 2514 3 1
4 us_retail 2514 3 0
5 us_energy 2514 3 0
📄 CSV: us_tech_vecm_alpha.csv
Unnamed: 0 0
0 AAPL -0.014304
1 MSFT -0.013418
2 NVDA -0.011221
📄 CSV: us_tech_vecm_beta.csv
Unnamed: 0 0
0 AAPL 1.000000
1 MSFT -0.605214
2 NVDA 0.159978
===== TXT summary =====


📜 TXT: rus_fin_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.5218, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 3.9606, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.9101, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 3.18679748e-02 -2.96416098e-03 -1.09273064e-02]
 [-2.22903588e-03  2.58130983e-03 -3.37488058e-03]
 [-1.14552177e-03 -4.38971114e-05 -2.92686747e-04]]

📜 TXT: rus_met_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.1479, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 8.1063, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.4505, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 9.64932376e-02  5.42679164e-02 -2.24327732e-02]
 [ 8.70301883e-05 -3.53601350e-04  1.06998467e-04]
 [-1.68966780e-02 -2.59815708e-03  4.85229896e-03]]

📜 TXT: rus_oil_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.8228, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 5.6565, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.0505, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 0.01626521  0.0204432   0.0288041 ]
 [ 0.00113576 -0.00155161 -0.00058345]
 [-0.02962675  0.00249476 -0.00402397]]

📜 TXT: us_energy_johansen.txt
Trace stats:
r <= 0 : trace_stat = 21.8316, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 8.5973, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 2.7075, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 0.05489743  0.02398857  0.09237557]
 [-0.13488873  0.01556991  0.00956623]
 [ 0.09913617 -0.06809139 -0.07382377]]

📜 TXT: us_retail_johansen.txt
Trace stats:
r <= 0 : trace_stat = 21.1804, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 5.7589, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 0.8685, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 0.10587704  0.06328421 -0.13665006]
 [-0.01226271 -0.00411667  0.00913947]
 [ 0.01191872 -0.01891335 -0.00719376]]

📜 TXT: us_tech_VECM_summary.txt
Det. terms outside the coint. relation & lagged endog. parameters for equation AAPL
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL        0.0604      0.025      2.438      0.015       0.012       0.109
L1.MSFT       -0.0391      0.015     -2.541      0.011      -0.069      -0.009
L1.NVDA       -0.0279      0.029     -0.948      0.343      -0.086       0.030
Det. terms outside the coint. relation & lagged endog. parameters for equation MSFT
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL       -0.0854      0.043     -2.003      0.045      -0.169      -0.002
L1.MSFT       -0.0666      0.026     -2.516      0.012      -0.119      -0.015
L1.NVDA        0.1193      0.051      2.357      0.018       0.020       0.218
Det. terms outside the coint. relation & lagged endog. parameters for equation NVDA
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL        0.0129      0.019      0.689      0.491      -0.024       0.050
L1.MSFT       -0.0110      0.012     -0.938      0.348      -0.034       0.012
L1.NVDA       -0.0992      0.022     -4.447      0.000      -0.143      -0.055
                Loading coefficients (alpha) for equation AAPL                
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1       

📜 TXT: us_tech_johansen.txt
Trace stats:
r <= 0 : trace_stat = 46.6903, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 4.7246, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.8763, crit90=2.7055, crit95=3.8415, crit99=6.6349

Eigenvectors (res.evec):
[[ 0.06196515 -0.01950739 -0.02208805]
 [-0.03742795 -0.00276564  0.01660908]
 [ 0.00935774  0.02621031 -0.03264757]]

Подгон VARMAX с экзогенными переменными¶

In [40]:
# varmax_example.py
import yfinance as yf
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.varmax import VARMAX
import os

# пример для rus_fin
sector_file = "rus_fin_prices.csv"
df_prices = pd.read_csv(sector_file, index_col=0, parse_dates=True).dropna(how='any')

# скачиваем экзогены
exogs = yf.download(["IMOEX.ME", "BRN-USD"], start=df_prices.index[0].strftime("%Y-%m-%d"), end=df_prices.index[-1].strftime("%Y-%m-%d"), progress=False, threads=False)
# Если названия не совпали, подбирай нужные тикеры. В примере может потребоваться заменить ticker на корректный.
# Преобразуем в лог-доходности (exog)
exog_prices = exogs['Adj Close'] if 'Adj Close' in exogs else exogs['Close']
exog_prices = exog_prices.dropna(how='all')
exog_returns = np.log(exog_prices).diff().reindex(df_prices.index).dropna(how='any')

# целевая переменная — лог-доходности сектора
df_returns = np.log(df_prices).diff().dropna(how='any')

# приведём экзогены и endog к одному индексу
common_idx = df_returns.index.intersection(exog_returns.index)
endog = df_returns.loc[common_idx]
exog = exog_returns.loc[common_idx]

# модель VARMAX(p,q) — начинаем с (1,1)
mod = VARMAX(endog, exog=exog, order=(1,1))
res = mod.fit(maxiter=200, disp=False)
print(res.summary())

# Forecast (пример 20)
fc = res.get_forecast(steps=20, exog=exog.iloc[-1:].values.repeat(20, axis=0))
pred = fc.predicted_mean
pred.to_csv(os.path.join("VARMAX_results", "rus_fin_varmax_forecast_h20.csv"))
                                   Statespace Model Results                                  
=============================================================================================
Dep. Variable:     ['SBER.ME', 'VTBR.ME', 'TCSG.ME']   No. Observations:                   93
Model:                                   VARMAX(1,1)   Log Likelihood                 836.162
                                         + intercept   AIC                          -1606.324
Date:                               Sun, 23 Nov 2025   BIC                          -1522.748
Time:                                       21:08:34   HQIC                         -1572.579
Sample:                                            0                                         
                                                - 93                                         
Covariance Type:                                 opg                                         
===================================================================================
Ljung-Box (L1) (Q):       0.00, 0.01, 0.00   Jarque-Bera (JB):     0.35, 1.19, 1.48
Prob(Q):                  0.99, 0.92, 0.99   Prob(JB):             0.84, 0.55, 0.48
Heteroskedasticity (H):   0.71, 0.80, 1.57   Skew:                0.01, 0.09, -0.17
Prob(H) (two-sided):      0.35, 0.54, 0.21   Kurtosis:             2.70, 2.48, 2.48
                             Results for equation SBER.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.SBER         0.0006      0.001      0.456      0.648      -0.002       0.003
L1.SBER.ME.SBER       -0.2008      0.452     -0.444      0.657      -1.088       0.686
L1.VTBR.ME.SBER        0.1947      0.452      0.431      0.667      -0.692       1.081
L1.TCSG.ME.SBER        0.0200      0.220      0.091      0.927      -0.411       0.451
L1.e(SBER.ME).SBER     0.0333      0.484      0.069      0.945      -0.915       0.982
L1.e(VTBR.ME).SBER     0.0436      0.460      0.095      0.925      -0.858       0.945
L1.e(TCSG.ME).SBER     0.0237      0.210      0.113      0.910      -0.388       0.435
beta.BRN-USD.SBER     -0.0191      0.031     -0.616      0.538      -0.080       0.042
beta.IMOEX.ME.SBER     1.2744      0.141      9.028      0.000       0.998       1.551
                             Results for equation VTBR.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.VTBR     -7.402e-05      0.001     -0.062      0.951      -0.002       0.002
L1.SBER.ME.VTBR        0.0528      0.397      0.133      0.894      -0.726       0.832
L1.VTBR.ME.VTBR        0.0683      0.396      0.173      0.863      -0.707       0.844
L1.TCSG.ME.VTBR        0.0144      0.250      0.058      0.954      -0.475       0.504
L1.e(SBER.ME).VTBR     0.0221      0.413      0.054      0.957      -0.786       0.831
L1.e(VTBR.ME).VTBR    -0.0354      0.426     -0.083      0.934      -0.871       0.800
L1.e(TCSG.ME).VTBR    -0.0154      0.249     -0.062      0.951      -0.504       0.473
beta.BRN-USD.VTBR     -0.0090      0.027     -0.337      0.736      -0.061       0.043
beta.IMOEX.ME.VTBR     1.1525      0.129      8.941      0.000       0.900       1.405
                             Results for equation TCSG.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.TCSG         0.0007      0.003      0.226      0.821      -0.005       0.007
L1.SBER.ME.TCSG       -0.4424      0.897     -0.493      0.622      -2.200       1.315
L1.VTBR.ME.TCSG        0.5069      0.851      0.596      0.551      -1.161       2.174
L1.TCSG.ME.TCSG        0.0603      0.566      0.106      0.915      -1.050       1.170
L1.e(SBER.ME).TCSG     0.2572      0.927      0.277      0.781      -1.560       2.074
L1.e(VTBR.ME).TCSG     0.1114      0.922      0.121      0.904      -1.696       1.918
L1.e(TCSG.ME).TCSG     0.0543      0.582      0.093      0.926      -1.086       1.194
beta.BRN-USD.TCSG      0.0500      0.059      0.844      0.399      -0.066       0.166
beta.IMOEX.ME.TCSG     0.7670      0.309      2.479      0.013       0.161       1.374
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.SBER.ME             0.0092      0.001     10.791      0.000       0.008       0.011
sqrt.cov.SBER.ME.VTBR.ME     0.0024      0.001      1.657      0.097      -0.000       0.005
sqrt.var.VTBR.ME             0.0095      0.001      9.418      0.000       0.008       0.011
sqrt.cov.SBER.ME.TCSG.ME    -0.0007      0.003     -0.246      0.806      -0.007       0.005
sqrt.cov.VTBR.ME.TCSG.ME    -0.0031      0.003     -1.098      0.272      -0.009       0.002
sqrt.var.TCSG.ME             0.0202      0.002      8.521      0.000       0.016       0.025
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

Сравнение моделей¶

Возьмем скользяющую ошибку RMSE

In [41]:
# rolling_var_evaluation.py
import os
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.var_model import VAR
from sklearn.metrics import mean_squared_error, mean_absolute_error
from tqdm.auto import tqdm  # если не установлено, можно убрать tqdm

# ============= SETTINGS =============
SECTOR_FILES = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT_DIR = "rolling_results"
os.makedirs(OUT_DIR, exist_ok=True)

MIN_TRAIN_SIZE = 250     # минимальный размер тренировки до старта rolling (в рабочих днях)
HOLDOUT_DAYS = 200       # сколько дней оценивать (длина rolling / теста)
MAX_LAGS = 8             # макс лагов для select_order (на первом шаге)
FIX_LAG = None           # если хочешь принудительно задать lag -> укажи целое (напр. 2), иначе None
USE_EXPANDING = True     # True -> expanding window; False -> moving window
ROLLING_WINDOW_SIZE = 500  # если USE_EXPANDING=False, размер окна

# ============= FUNCTIONS =============
def prepare_returns(df_prices):
    df_prices = df_prices.sort_index()
    returns = np.log(df_prices).diff().dropna(how='all')
    returns = returns.dropna(how='any')  # VAR требует полных строк
    return returns

def fit_var_and_forecast_one_step(train_returns, lag):
    """
    Fit VAR(lag) on train_returns (DataFrame) and forecast 1 step ahead (returns).
    Returns predicted 1-step array (shape = n_vars,)
    """
    model = VAR(train_returns)
    res = model.fit(lag)
    # use last k_ar observations from train_returns
    y_input = train_returns.values[-res.k_ar:]
    fc = res.forecast(y=y_input, steps=1)  # returns array shape (1, nvars)
    return fc.ravel(), res

# ============= MAIN LOOP per sector =============
overall_summary = []

for sector, file in SECTOR_FILES.items():
    print("\n" + "="*80)
    print("Sector:", sector)
    print("="*80)

    # load prices (levels)
    df_prices = pd.read_csv(file, index_col=0, parse_dates=True)
    df_prices = df_prices.sort_index()
    # prepare returns
    df_ret = prepare_returns(df_prices)

    n = df_ret.shape[0]
    if n < MIN_TRAIN_SIZE + 10:
        print(f"Not enough obs for sector {sector}: {n} < MIN_TRAIN_SIZE + 10, skipping.")
        continue

    # define start index for rolling: leave last HOLDOUT_DAYS for evaluation OR use MIN_TRAIN_SIZE
    start_idx = max(0, n - HOLDOUT_DAYS - MIN_TRAIN_SIZE)
    # we'll iterate from train_end = start_idx + MIN_TRAIN_SIZE - 1 up to n-2 (so that 1-step ahead exists)
    first_train_end = start_idx + MIN_TRAIN_SIZE - 1
    last_train_end = n - 2  # last index position where 1-step ahead exists

    train_end_positions = list(range(first_train_end, last_train_end + 1))
    m = len(train_end_positions)
    print(f"Total rolling steps: {m} (from pos {first_train_end} to {last_train_end})")

    # determine lag: either FIX_LAG or select_order on the initial training window
    if FIX_LAG is not None:
        chosen_lag = int(FIX_LAG)
        print("Using fixed lag:", chosen_lag)
    else:
        # initial training window
        init_train = df_ret.iloc[: first_train_end + 1]  # inclusive
        try:
            sel = VAR(init_train).select_order(MAX_LAGS)
            chosen_lag = sel.selected_orders.get('aic', None) or sel.selected_orders.get('bic', None) or 1
            if pd.isna(chosen_lag):
                chosen_lag = 1
            chosen_lag = int(chosen_lag)
        except Exception as e:
            print("select_order failed on init window, fallback to lag=1. Error:", e)
            chosen_lag = 1
        print("Chosen lag from initial selection:", chosen_lag)

    # storage for predictions and true values
    cols = df_ret.columns.tolist()
    preds = pd.DataFrame(index=[df_ret.index[i+1] for i in train_end_positions], columns=cols, dtype=float)
    trues = pd.DataFrame(index=preds.index, columns=cols, dtype=float)

    # rolling loop
    for i, train_end_pos in enumerate(tqdm(train_end_positions, desc=f"Rolling {sector}")):
        # define train window indices
        if USE_EXPANDING:
            train_start_pos = 0
        else:
            train_start_pos = max(0, train_end_pos - (ROLLING_WINDOW_SIZE - 1))

        train_df = df_ret.iloc[train_start_pos: train_end_pos + 1]  # inclusive
        test_pos = train_end_pos + 1
        test_date = df_ret.index[test_pos]

        # ensure enough observations to fit VAR(chosen_lag)
        if train_df.shape[0] <= chosen_lag:
            # not enough obs, fill NaN and continue
            preds.loc[test_date] = np.nan
            trues.loc[test_date] = df_ret.iloc[test_pos].values
            continue

        try:
            yhat, fitted = fit_var_and_forecast_one_step(train_df, chosen_lag)
            preds.loc[test_date] = yhat
            trues.loc[test_date] = df_ret.iloc[test_pos].values
        except Exception as e:
            # on error, log NaNs
            print(f"Step {i} error at date {test_date}: {e}")
            preds.loc[test_date] = np.nan
            trues.loc[test_date] = df_ret.iloc[test_pos].values

    # drop rows where preds are all NaN (failed steps)
    valid_mask = ~preds.isna().all(axis=1)
    preds = preds.loc[valid_mask]
    trues = trues.loc[valid_mask]

    # compute per-ticker RMSE/MAE
    rmse_per = ((preds - trues)**2).mean().apply(np.sqrt)
    mae_per = (preds - trues).abs().mean()
    summary_df = pd.DataFrame({
        "RMSE": rmse_per,
        "MAE": mae_per
    })
    summary_df.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_errors_per_ticker.csv"))

    # overall aggregated metrics (mean across tickers)
    overall_rmse = rmse_per.mean()
    overall_mae = mae_per.mean()

    # save preds and trues
    preds.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_preds.csv"))
    trues.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_true.csv"))
    summary_df.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_summary.csv"))

    # Save overall summary row
    overall_summary.append({
        "sector": sector,
        "n_steps": preds.shape[0],
        "chosen_lag": chosen_lag,
        "overall_RMSE_mean": overall_rmse,
        "overall_MAE_mean": overall_mae
    })

    # Plot aggregated errors over time (mean absolute error across tickers per date)
    mad_series = (preds - trues).abs().mean(axis=1)
    rmse_series = np.sqrt(((preds - trues)**2).mean(axis=1))

    # plot MAE over time
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(1,2, figsize=(14,4))
    ax[0].plot(mad_series.index, mad_series.values)
    ax[0].set_title(f"{sector} - Mean Absolute Error (one-step) over time")
    ax[0].set_xlabel("Date")
    ax[0].set_ylabel("MAE")
    ax[0].grid(True)

    ax[1].plot(rmse_series.index, rmse_series.values)
    ax[1].set_title(f"{sector} - RMSE (one-step) over time")
    ax[1].set_xlabel("Date")
    ax[1].set_ylabel("RMSE")
    ax[1].grid(True)

    plt.tight_layout()
    fig_path = os.path.join(OUT_DIR, f"{sector}_rolling_error_timeseries.png")
    fig.savefig(fig_path, bbox_inches='tight', dpi=150)
    plt.show()
    plt.close(fig)

# Save overall summary
overall_df = pd.DataFrame(overall_summary)
overall_df.to_csv(os.path.join(OUT_DIR, "rolling_overall_summary.csv"), index=False)
print("\nRolling evaluation completed. Results are in folder:", OUT_DIR)
================================================================================
Sector: rus_fin
================================================================================
Total rolling steps: 178 (from pos 249 to 426)
Chosen lag from initial selection: 7
Rolling rus_fin: 100%|██████████| 178/178 [00:01<00:00, 139.18it/s]
No description has been provided for this image
================================================================================
Sector: rus_oil
================================================================================
Total rolling steps: 200 (from pos 1414 to 1613)
Chosen lag from initial selection: 1
Rolling rus_oil: 100%|██████████| 200/200 [00:01<00:00, 153.86it/s]
No description has been provided for this image
================================================================================
Sector: rus_met
================================================================================
Total rolling steps: 200 (from pos 1414 to 1613)
Chosen lag from initial selection: 1
Rolling rus_met: 100%|██████████| 200/200 [00:02<00:00, 97.75it/s] 
No description has been provided for this image
================================================================================
Sector: us_tech
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 8
Rolling us_tech: 100%|██████████| 200/200 [00:08<00:00, 24.25it/s]
No description has been provided for this image
================================================================================
Sector: us_retail
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 1
Rolling us_retail: 100%|██████████| 200/200 [00:02<00:00, 87.40it/s]
No description has been provided for this image
================================================================================
Sector: us_energy
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 7
Rolling us_energy: 100%|██████████| 200/200 [00:06<00:00, 29.06it/s]
No description has been provided for this image
Rolling evaluation completed. Results are in folder: rolling_results

Сделаем прогноз в будущее

In [42]:
import os
import warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VAR
In [43]:
# ---------------- SETTINGS (подправь пути при необходимости) ----------------
SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT_DIR = "FORECASTS"
OUT_VAR = os.path.join(OUT_DIR, "VAR")
os.makedirs(OUT_VAR, exist_ok=True)

# параметры по умолчанию
HOLDOUT_DAYS = 60      # размер тестовой части (backtest). Поменяй как нужно
MAX_LAGS_VAR = 8
DEFAULT_VAR_LAG = 2
PLOT_XLIM_LEFT = pd.to_datetime("2005-03-01")  # пример из твоего фрагмента

# ---------------- utility functions ----------------
def ensure_numeric_df(df):
    df2 = df.copy()
    for c in df2.columns:
        df2[c] = pd.to_numeric(df2[c], errors='coerce')
    df2 = df2.dropna(axis=1, how='all')
    return df2

def compute_log_returns(df_prices):
    return np.log(df_prices).diff().dropna(how='any')

def make_future_index(last_index, h):
    start = last_index[-1] + pd.Timedelta(days=1)
    try:
        idx = pd.bdate_range(start=start, periods=h)
    except Exception:
        idx = pd.date_range(start=start, periods=h, freq='B')
    return idx

def safe_reconstruct_prices_from_log_returns(last_prices, fc_returns_df):
    """
    last_prices: pd.Series (last observed prices)
    fc_returns_df: DataFrame of forecasted log-returns (index = future dates)
    returns DataFrame of forecasted prices
    """
    common = [c for c in fc_returns_df.columns if c in last_prices.index]
    if len(common) == 0:
        raise ValueError("No matching tickers between last_prices and fc_returns_df")
    if len(common) < len(fc_returns_df.columns):
        missing = [c for c in fc_returns_df.columns if c not in common]
        print("⚠ Missing last_prices for tickers:", missing)
    fc = fc_returns_df[common].copy()
    prev = last_prices[common].astype(float).copy()
    prices = []
    for i in range(len(fc)):
        r = fc.iloc[i]
        next_price = prev * np.exp(r)
        prices.append(next_price.copy())
        prev = next_price
    return pd.DataFrame(prices, index=fc.index, columns=fc.columns)

def var_forecast_plot_prices(sector, csv_path, holdout_days=60, maxlags=8):

    print(f"=== SECTOR: {sector} ===")

    # ---------- LOAD ----------
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    df_prices = df_prices.sort_index()
    df_prices = df_prices.apply(pd.to_numeric, errors="coerce").dropna()

    # ---------- RETURNS ----------
    df_ret = np.log(df_prices).diff().dropna()

    # ---------- TRAIN / TEST ----------
    train_ret = df_ret.iloc[:-holdout_days]
    test_ret  = df_ret.iloc[-holdout_days:]

    train_prices = df_prices.loc[train_ret.index]
    test_prices  = df_prices.loc[test_ret.index]

    last_price = train_prices.iloc[-1]

    # ---------- FIT VAR ----------
    model = VAR(train_ret)
    try:
        sel = model.select_order(maxlags)
        lag = sel.selected_orders.get('aic') or sel.selected_orders.get('bic') or 2
        lag = int(lag)
    except:
        lag = 2

    print("Chosen lag:", lag)

    res = model.fit(lag)

    # ---------- FORECAST RETURNS ----------
    steps = len(test_ret)
    fc_ret = res.forecast(train_ret.values[-res.k_ar:], steps=steps)

    fc_ret_df = pd.DataFrame(fc_ret, index=test_ret.index, columns=train_ret.columns)

    # ---------- CONVERT RETURNS BACK TO PRICES ----------
    fc_prices = []
    prev = last_price.copy()

    for t in range(steps):
        new_price = prev * np.exp(fc_ret_df.iloc[t])
        fc_prices.append(new_price)
        prev = new_price

    fc_prices_df = pd.DataFrame(fc_prices, index=test_ret.index, columns=train_ret.columns)

    # ---------- PLOT (like your example) ----------
    for col in train_ret.columns:
        fig, ax = plt.subplots(figsize=(15, 2))

        # История
        train_prices[col].plot(ax=ax, color="black")

        # Прогноз
        fc_prices_df[col].rename(col+"-VAR").plot(ax=ax, linestyle="--", color="blue")

        # Склейка
        ax.scatter([train_prices.index[-1]], [train_prices[col].iloc[-1]], color="red", s=30)

        ax.set_xlim(left=pd.to_datetime("2005-03-01"))
        ax.set_title(f"{sector} — {col} | VAR forecast")
        ax.grid(True)
        plt.show()

    return {
        "train": train_prices,
        "test": test_prices,
        "fc_prices": fc_prices_df,
        "fc_returns": fc_ret_df,
        "var_model": res
    }

# ---------------- core: train/forecast/plot for one sector ----------------
def var_forecast_and_plot(sector_name, csv_path,
                          holdout_days=HOLDOUT_DAYS,
                          maxlags=MAX_LAGS_VAR,
                          default_lag=DEFAULT_VAR_LAG,
                          save_out=True,
                          show_plots=True,
                          xlim_left=PLOT_XLIM_LEFT):
    """
    1) Reads csv_path (prices levels)
    2) Builds log-returns, splits into train/test (last holdout_days rows -> test)
    3) Fits VAR on train returns, selects lag by AIC (if possible)
    4) Forecasts returns for len(test) steps
    5) Reconstructs forecast prices, saves CSVs and plots per-ticker
    """
    # -- load and prepare --
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    df_prices = ensure_numeric_df(df_prices).sort_index()
    if df_prices.shape[1] < 2:
        print(f"Sector {sector_name}: need >=2 series, got {df_prices.shape[1]}. Skipping.")
        return None

    # compute returns
    df_ret = compute_log_returns(df_prices)
    if df_ret.shape[0] < 30:
        print(f"Sector {sector_name}: too few returns observations ({df_ret.shape[0]}). Skipping.")
        return None

    # split train / test by last holdout_days
    if holdout_days is None or holdout_days <= 0 or holdout_days >= df_ret.shape[0]:
        # use default small test (10) if invalid
        holdout_days = min(10, max(1, df_ret.shape[0]//10))

    train_ret = df_ret.iloc[:-holdout_days]
    test_ret  = df_ret.iloc[-holdout_days:]

    # also get train prices (levels) for plotting and last_price
    train_prices = df_prices.loc[train_ret.index]
    test_prices = df_prices.loc[test_ret.index]
    last_price_for_reconstruct = train_prices.iloc[-1]

    # -- select lag --
    try:
        sel = VAR(train_ret).select_order(maxlags)
        chosen = sel.selected_orders.get('aic') or sel.selected_orders.get('bic') or default_lag
        lag = int(chosen) if (pd.notna(chosen)) else default_lag
        if lag < 1:
            lag = default_lag
    except Exception:
        lag = default_lag
    print(f"{sector_name}: chosen VAR lag = {lag}")

    # -- fit VAR --
    model = VAR(train_ret)
    try:
        var_res = model.fit(lag)
    except Exception as e:
        print(f"{sector_name}: VAR fit failed: {e}")
        return None

    # -- forecast returns for test horizon --
    steps = len(test_ret)
    fc_array = var_res.forecast(y=train_ret.values[-var_res.k_ar:], steps=steps)
    # align index with test_ret index (same as example)
    fc_returns = pd.DataFrame(fc_array, index=test_ret.index, columns=train_ret.columns)
    fc_returns = fc_returns.rename(columns={c: c + "-VAR" for c in fc_returns.columns})

    # -- reconstruct forecast prices from last train price --
    # NOTE: fc_returns columns currently have suffix '-VAR'; create mapping back to original
    orig_cols = [c.replace("-VAR", "") for c in fc_returns.columns]
    fc_returns_no_suffix = fc_returns.copy()
    fc_returns_no_suffix.columns = orig_cols

    fc_prices = safe_reconstruct_prices_from_log_returns(last_price_for_reconstruct, fc_returns_no_suffix)

    # add suffix to price columns to save
    fc_prices.columns = [c + "-VAR" for c in fc_prices.columns]

    # -- save CSVs --
    if save_out:
        base = f"{sector_name}_VAR_h{steps}"
        fc_returns.to_csv(os.path.join(OUT_VAR, base + "_returns.csv"))
        fc_prices.to_csv(os.path.join(OUT_VAR, base + "_prices.csv"))
        print(f"Saved: {os.path.join(OUT_VAR, base + '_returns.csv')}")
        print(f"Saved: {os.path.join(OUT_VAR, base + '_prices.csv')}")

    # -- plotting (per-ticker) --
    # We'll plot history = train_prices (full train), then dashed forecast starting from the next date after last train price.
    for orig_col in train_ret.columns:
        col_var = orig_col + "-VAR"
        fig, ax = plt.subplots(figsize=(15, 2))
        # combine: show train_prices (levels) and forecast prices (levels)
        # build a concat DataFrame with column names: orig_col (history) and col_var (forecast)
        # to match your example, we plot train (history) and forecast (dashed)
        hist_series = pd.DataFrame(train_prices[[orig_col]])
        fc_series = pd.DataFrame(fc_prices[[col_var]])
        combined = pd.concat([hist_series, fc_series], axis=1)

        # plot
        combined.plot(ax=ax, legend=False)
        # set xlim if possible
        try:
            ax.set_xlim(left=xlim_left)
        except Exception:
            pass
        ax.set_xlabel("")
        ax.set_title(f"{sector_name} — {orig_col} — VAR forecast (h={steps})")
        # tighten visually
        plt.tight_layout()
        if show_plots:
            plt.show()
        # save per-plot
        fname = os.path.join(OUT_VAR, f"{sector_name}_{orig_col}_VAR_h{steps}.png")
        fig.savefig(fname, bbox_inches='tight', dpi=150)
        plt.close(fig)

    # Also return dict with results
    return {"var_res": var_res, "train_ret": train_ret, "test_ret": test_ret,
            "fc_returns": fc_returns, "fc_prices": fc_prices, "lag": lag}
In [44]:
for s, path in SECTORS.items():
    print("Running sector:", s)
    var_forecast_plot_prices(s, path, holdout_days=HOLDOUT_DAYS)
Running sector: rus_fin
=== SECTOR: rus_fin ===
Chosen lag: 4
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Running sector: rus_oil
=== SECTOR: rus_oil ===
Chosen lag: 1
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Running sector: rus_met
=== SECTOR: rus_met ===
Chosen lag: 2
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Running sector: us_tech
=== SECTOR: us_tech ===
Chosen lag: 1
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Running sector: us_retail
=== SECTOR: us_retail ===
Chosen lag: 2
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Running sector: us_energy
=== SECTOR: us_energy ===
Chosen lag: 7
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Не прошла, попробуем снова)

In [45]:
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen
from statsmodels.tsa.statespace.varmax import VARMAX
from sklearn.metrics import mean_squared_error, mean_absolute_error
import matplotlib.pyplot as plt

# -------------------- HELPER UTILS --------------------

def compute_returns(df_prices):
    return np.log(df_prices).diff().dropna()

def reconstruct_prices(last_price, forecast_returns):
    """Преобразование лог-доходностей в цены по шагам."""
    prices = []
    prev = last_price.copy()
    for t in range(len(forecast_returns)):
        prev = prev * np.exp(forecast_returns.iloc[t])
        prices.append(prev.copy())
    return pd.DataFrame(prices, index=forecast_returns.index, columns=forecast_returns.columns)

def future_index(start_date, steps):
    """Создаёт h будущих торговых дней."""
    return pd.bdate_range(start=start_date + pd.Timedelta(days=1), periods=steps)

def rmse(y_true, y_pred):
    return np.sqrt(mean_squared_error(y_true, y_pred))

def mae(y_true, y_pred):
    return mean_absolute_error(y_true, y_pred)

def monte_carlo_VAR(var_res, last_price, h, returns_train, n_sims=200):
    """
    Выполняет Monte Carlo симуляцию VAR:
    - вытягивает средний прогноз
    - добавляет шум N(0, Σ_u)
    - реконструирует N траекторий цен
    """
    Σ = var_res.sigma_u
    k = Σ.shape[0]

    mean_fc = var_res.forecast(returns_train.values[-var_res.k_ar:], steps=h)
    mean_fc_df = pd.DataFrame(mean_fc, index=future_index(returns_train.index[-1], h), columns=returns_train.columns)

    sims = np.zeros((n_sims, h, k))
    last = last_price.values

    for s in range(n_sims):
        path = last.copy()
        for t in range(h):
            shock = np.random.multivariate_normal(np.zeros(k), Σ)
            r = mean_fc[t] + shock
            path = path * np.exp(r)
            sims[s,t,:] = path
    return mean_fc_df, sims

def summarize_mc_paths(sims, index, columns):
    """Возвращает mean, median, 5%, 95% пути."""
    mean = sims.mean(axis=0)
    median = np.median(sims, axis=0)
    low = np.quantile(sims, 0.05, axis=0)
    high = np.quantile(sims, 0.95, axis=0)

    return (
        pd.DataFrame(mean, index=index, columns=columns),
        pd.DataFrame(median, index=index, columns=columns),
        pd.DataFrame(low, index=index, columns=columns),
        pd.DataFrame(high, index=index, columns=columns),
    )
In [46]:
def run_VAR(df_prices, train_ret, test_ret):
    # lag selection
    sel = VAR(train_ret).select_order(8)
    lag = sel.selected_orders.get("aic") or sel.selected_orders.get("bic") or 2
    lag = int(lag)

    var_model = VAR(train_ret).fit(lag)

    # ----------- SUMMARY OUTPUT -----------
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(var_model.summary().as_text())
    except:
        print(var_model.summary())
    # --------------------------------------

    # forecast returns
    h = len(test_ret)
    fc_ret = var_model.forecast(train_ret.values[-var_model.k_ar:], steps=h)
    fc_ret_df = pd.DataFrame(fc_ret, index=test_ret.index, columns=train_ret.columns)

    # prices
    fc_price = reconstruct_prices(df_prices.iloc[-h-1], fc_ret_df)

    # Monte Carlo
    mean_fc_df, sims = monte_carlo_VAR(var_model, df_prices.iloc[-h-1], h, train_ret)
    mean_mc, med_mc, low_mc, high_mc = summarize_mc_paths(sims, mean_fc_df.index, mean_fc_df.columns)

    return {
        "model": var_model,
        "forecast_returns": fc_ret_df,
        "forecast_prices": fc_price,
        "mc_mean": mean_mc,
        "mc_low": low_mc,
        "mc_high": high_mc,
        "mc_sims": sims,
        # "lag": lag
    }

def run_VECM(df_prices, train_ret, test_ret):
    """
    Прогноз VECM с автоматической защитой от ошибок размерности.
    Использует только встроенный predict(), без ручной математики alpha*beta.
    """

    # уровни (не returns!)
    train_lvl = df_prices.loc[train_ret.index]
    test_lvl  = df_prices.loc[test_ret.index]

    # ---------- Johansen тест ----------
    try:
        joh = coint_johansen(train_lvl, det_order=0, k_ar_diff=1)
    except Exception as e:
        print("VECM: Johansen failed:", e)
        return {"model": None}

    trace = joh.lr1
    cvt = joh.cvt

    r = sum(trace > cvt[:,1])  # rank at 95% level

    if r <= 0:
        print("VECM: No cointegration (rank = 0). Skipping.")
        return {"model": None}

    if r >= train_lvl.shape[1]:
        print(f"VECM: Invalid rank r={r}, k={train_lvl.shape[1]}. Skipping.")
        return {"model": None}

    # ---------- Fit VECM ----------
    try:
        vecm = VECM(train_lvl, k_ar_diff=1, coint_rank=r)
        res = vecm.fit()
    except Exception as e:
        print("VECM fit failed:", e)
        return {"model": None}
    
    # ----------- SUMMARY OUTPUT -----------
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(res.summary().as_text())
    except:
        print(res.summary())
    # --------------------------------------

    # ---------- Forecast ----------
    h = len(test_ret)
    try:
        fc = res.predict(steps=h)
    except Exception as e:
        print("VECM forecast failed:", e)
        return {"model": None}

    idx = test_lvl.index
    fc_df = pd.DataFrame(fc, index=idx, columns=train_lvl.columns)

    # ---------- Monte Carlo ----------
    # упрощённый вариант: просто добавляем небольшие shocks
    resid = res.resid
    Σ = np.cov(resid.T)

    sims = np.zeros((200, h, train_lvl.shape[1]))
    for s in range(200):
        prev = train_lvl.iloc[-1].values.copy()
        for t in range(h):
            eps = np.random.multivariate_normal(np.zeros(train_lvl.shape[1]), Σ)
            prev = fc_df.iloc[t].values + eps
            sims[s,t,:] = prev

    mean_mc = sims.mean(axis=0)
    mean_mc_df = pd.DataFrame(mean_mc, index=idx, columns=train_lvl.columns)

    return {
        "model": res,
        "forecast_prices": fc_df,
        "mc_mean": mean_mc_df,
        "mc_sims": sims
    }

def run_VARMAX(df_prices, train_ret, test_ret):
    try:
        # простая модель без экзогенов
        model = VARMAX(train_ret, order=(1,1))
        res = model.fit(disp=False)
    except:
        return {"model": None}

    # ----------- SUMMARY OUTPUT -----------
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(res.summary().as_text())
    except:
        print(res.summary())
    # --------------------------------------

    h = len(test_ret)
    fc_ret = res.forecast(steps=h)
    fc_ret.index = test_ret.index

    fc_price = reconstruct_prices(df_prices.iloc[-h-1], fc_ret)

    # MC: шум = ковариация ошибок
    resid = res.resid.dropna()
    Σ = np.cov(resid.T)

    sims = np.zeros((200, h, train_ret.shape[1]))

    for s in range(200):
        prev = df_prices.iloc[-h-1].values.copy()
        for t in range(h):
            eps = np.random.multivariate_normal(np.zeros(train_ret.shape[1]), Σ)
            r = fc_ret.iloc[t].values + eps
            prev = prev * np.exp(r)
            sims[s,t,:] = prev

    mean_mc = sims.mean(axis=0)
    mean_mc_df = pd.DataFrame(mean_mc, index=test_ret.index, columns=train_ret.columns)

    return {
        "model": res,
        "forecast_prices": fc_price,
        "mc_mean": mean_mc_df,
        "mc_sims": sims
    }

def compare_models(sector, df_prices, holdout_days=60):
    print(f"\n=== {sector.upper()} ===")

    df_prices = df_prices.apply(pd.to_numeric, errors="coerce").dropna()

    df_ret = compute_returns(df_prices)

    train = df_ret.iloc[:-holdout_days]
    test  = df_ret.iloc[-holdout_days:]

    # ---------------- VAR ----------------
    VAR_res = run_VAR(df_prices, train, test)

    # ---------------- VECM ----------------
    VECM_res = run_VECM(df_prices, train, test)

    # ---------------- VARMAX ----------------
    VARMAX_res = run_VARMAX(df_prices, train, test)

    # ---------------- ERROR METRICS ----------------
    metrics = {}      

# VAR
    if VAR_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VAR_res["forecast_prices"].index]
        y_pred = VAR_res["forecast_prices"]
        metrics["VAR_RMSE"] = rmse(y_true, y_pred)
        metrics["VAR_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VAR_RMSE"] = None
        metrics["VAR_MAE"] = None

    # VECM
    if VECM_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VECM_res["forecast_prices"].index]
        y_pred = VECM_res["forecast_prices"]
        metrics["VECM_RMSE"] = rmse(y_true, y_pred)
        metrics["VECM_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VECM_RMSE"] = None
        metrics["VECM_MAE"] = None

    # VARMAX
    if VARMAX_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VARMAX_res["forecast_prices"].index]
        y_pred = VARMAX_res["forecast_prices"]
        metrics["VARMAX_RMSE"] = rmse(y_true, y_pred)
        metrics["VARMAX_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VARMAX_RMSE"] = None
        metrics["VARMAX_MAE"] = None

    return {
        "VAR": VAR_res,
        "VECM": VECM_res,
        "VARMAX": VARMAX_res,
        "metrics": metrics
    }

Делаем прогнозы и строим графики

In [47]:
results = {}

for sector, file in SECTORS.items():
    df = pd.read_csv(file, index_col=0, parse_dates=True)
    results[sector] = compare_models(sector, df, holdout_days=60)
=== RUS_FIN ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:09:15
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -23.0519
Nobs:                     364.000    HQIC:                  -23.3035
Log likelihood:           2760.97    FPE:                6.41754e-11
AIC:                     -23.4695    Det(Omega_mle):     5.77628e-11
--------------------------------------------------------------------
Results for equation SBER.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const             -0.000953         0.001161           -0.821           0.412
L1.SBER.ME        -0.120067         0.086005           -1.396           0.163
L1.VTBR.ME         0.118705         0.091821            1.293           0.196
L1.TCSG.ME        -0.010379         0.042614           -0.244           0.808
L2.SBER.ME        -0.053712         0.085561           -0.628           0.530
L2.VTBR.ME        -0.110885         0.090613           -1.224           0.221
L2.TCSG.ME         0.051327         0.042311            1.213           0.225
L3.SBER.ME        -0.125822         0.085604           -1.470           0.142
L3.VTBR.ME        -0.034478         0.090393           -0.381           0.703
L3.TCSG.ME         0.167684         0.042923            3.907           0.000
L4.SBER.ME         0.227856         0.084188            2.707           0.007
L4.VTBR.ME        -0.166232         0.089140           -1.865           0.062
L4.TCSG.ME        -0.010293         0.044614           -0.231           0.818
=============================================================================

Results for equation VTBR.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const             -0.001037         0.001032           -1.005           0.315
L1.SBER.ME        -0.054795         0.076427           -0.717           0.473
L1.VTBR.ME         0.195459         0.081596            2.395           0.017
L1.TCSG.ME         0.008296         0.037868            0.219           0.827
L2.SBER.ME         0.014093         0.076033            0.185           0.853
L2.VTBR.ME        -0.053074         0.080522           -0.659           0.510
L2.TCSG.ME        -0.013821         0.037599           -0.368           0.713
L3.SBER.ME         0.009426         0.076071            0.124           0.901
L3.VTBR.ME        -0.078054         0.080327           -0.972           0.331
L3.TCSG.ME         0.187902         0.038143            4.926           0.000
L4.SBER.ME         0.105043         0.074813            1.404           0.160
L4.VTBR.ME        -0.076017         0.079214           -0.960           0.337
L4.TCSG.ME         0.046019         0.039646            1.161           0.246
=============================================================================

Results for equation TCSG.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.001068         0.001729            0.618           0.537
L1.SBER.ME         0.014842         0.128082            0.116           0.908
L1.VTBR.ME         0.143472         0.136744            1.049           0.294
L1.TCSG.ME        -0.036270         0.063463           -0.572           0.568
L2.SBER.ME        -0.063421         0.127421           -0.498           0.619
L2.VTBR.ME        -0.101289         0.134945           -0.751           0.453
L2.TCSG.ME        -0.019129         0.063011           -0.304           0.761
L3.SBER.ME        -0.233042         0.127485           -1.828           0.068
L3.VTBR.ME         0.114846         0.134617            0.853           0.394
L3.TCSG.ME         0.254861         0.063923            3.987           0.000
L4.SBER.ME         0.288242         0.125377            2.299           0.022
L4.VTBR.ME        -0.074529         0.132752           -0.561           0.575
L4.TCSG.ME        -0.075576         0.066441           -1.137           0.255
=============================================================================

Correlation matrix of residuals
            SBER.ME   VTBR.ME   TCSG.ME
SBER.ME    1.000000  0.758710  0.539431
VTBR.ME    0.758710  1.000000  0.471629
TCSG.ME    0.539431  0.471629  1.000000




===== VAR SUMMARY () =====
Det. terms outside the coint. relation & lagged endog. parameters for equation SBER.ME
===================================================================================
                      coef    std err          z      P>|z|      [0.025      0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.SBER    -0.0022      0.092     -0.024      0.981      -0.183       0.179
L1.VTBR.ME.SBER    -0.0172      0.023     -0.752      0.452      -0.062       0.028
L1.TCSG.ME.SBER    -0.0026      0.004     -0.705      0.481      -0.010       0.005
Det. terms outside the coint. relation & lagged endog. parameters for equation VTBR.ME
===================================================================================
                      coef    std err          z      P>|z|      [0.025      0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.VTBR    -0.0359      0.368     -0.098      0.922      -0.758       0.686
L1.VTBR.ME.VTBR     0.1341      0.091      1.473      0.141      -0.044       0.312
L1.TCSG.ME.VTBR     0.0158      0.015      1.071      0.284      -0.013       0.045
Det. terms outside the coint. relation & lagged endog. parameters for equation TCSG.ME
===================================================================================
                      coef    std err          z      P>|z|      [0.025      0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.TCSG    -0.9703      3.163     -0.307      0.759      -7.170       5.230
L1.VTBR.ME.TCSG    -0.0012      0.781     -0.002      0.999      -1.533       1.530
L1.TCSG.ME.TCSG     0.0485      0.126      0.384      0.701      -0.199       0.296
              Loading coefficients (alpha) for equation SBER.ME               
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1.SBER       0.0009      0.035      0.027      0.978      -0.067       0.069
              Loading coefficients (alpha) for equation VTBR.ME               
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1.VTBR       0.1531      0.138      1.107      0.268      -0.118       0.424
              Loading coefficients (alpha) for equation TCSG.ME               
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1.TCSG       1.7600      1.187      1.483      0.138      -0.566       4.086
          Cointegration relations for loading-coefficients-column 1           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
beta.1         1.0000          0          0      0.000       1.000       1.000
beta.2        -0.1981      0.004    -44.854      0.000      -0.207      -0.189
beta.3        -0.0480      0.001    -91.424      0.000      -0.049      -0.047
==============================================================================

===== VAR SUMMARY () =====
                                   Statespace Model Results                                  
=============================================================================================
Dep. Variable:     ['SBER.ME', 'VTBR.ME', 'TCSG.ME']   No. Observations:                  368
Model:                                    VARMA(1,1)   Log Likelihood                2755.593
                                         + intercept   AIC                          -5457.186
Date:                               Sun, 23 Nov 2025   BIC                          -5351.668
Time:                                       21:11:00   HQIC                         -5415.265
Sample:                                            0                                         
                                               - 368                                         
Covariance Type:                                 opg                                         
========================================================================================
Ljung-Box (L1) (Q):       0.02, 0.00, 0.01   Jarque-Bera (JB):   387.28, 448.60, 3277.85
Prob(Q):                  0.89, 0.99, 0.92   Prob(JB):                  0.00, 0.00, 0.00
Heteroskedasticity (H):   0.81, 0.77, 0.59   Skew:                   -0.21, -0.35, -1.73
Prob(H) (two-sided):      0.25, 0.15, 0.00   Kurtosis:                 8.01, 8.36, 17.20
                             Results for equation SBER.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.SBER        -0.0006      0.002     -0.326      0.744      -0.004       0.003
L1.SBER.ME.SBER       -0.0274      0.644     -0.042      0.966      -1.290       1.235
L1.VTBR.ME.SBER        0.0185      0.465      0.040      0.968      -0.893       0.930
L1.TCSG.ME.SBER       -0.0396      0.606     -0.065      0.948      -1.227       1.148
L1.e(SBER.ME).SBER    -0.0382      0.651     -0.059      0.953      -1.314       1.238
L1.e(VTBR.ME).SBER     0.0542      0.466      0.116      0.907      -0.858       0.967
L1.e(TCSG.ME).SBER     0.0085      0.608      0.014      0.989      -1.183       1.200
                             Results for equation VTBR.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.VTBR        -0.0007      0.002     -0.405      0.686      -0.004       0.003
L1.SBER.ME.VTBR        0.0072      0.680      0.011      0.992      -1.326       1.340
L1.VTBR.ME.VTBR        0.1495      0.498      0.300      0.764      -0.827       1.126
L1.TCSG.ME.VTBR       -0.0084      0.616     -0.014      0.989      -1.216       1.199
L1.e(SBER.ME).VTBR    -0.0212      0.690     -0.031      0.976      -1.374       1.332
L1.e(VTBR.ME).VTBR     0.0233      0.519      0.045      0.964      -0.994       1.041
L1.e(TCSG.ME).VTBR     0.0013      0.615      0.002      0.998      -1.205       1.208
                             Results for equation TCSG.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.TCSG         0.0011      0.003      0.370      0.711      -0.005       0.007
L1.SBER.ME.TCSG        0.1434      1.116      0.129      0.898      -2.044       2.331
L1.VTBR.ME.TCSG       -0.0016      0.790     -0.002      0.998      -1.550       1.546
L1.TCSG.ME.TCSG       -0.0755      0.870     -0.087      0.931      -1.780       1.629
L1.e(SBER.ME).TCSG    -0.0615      1.109     -0.055      0.956      -2.235       2.112
L1.e(VTBR.ME).TCSG     0.0699      0.782      0.089      0.929      -1.462       1.602
L1.e(TCSG.ME).TCSG     0.0056      0.870      0.006      0.995      -1.699       1.710
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.SBER.ME             0.0223      0.001     31.555      0.000       0.021       0.024
sqrt.cov.SBER.ME.VTBR.ME     0.0152      0.001     19.924      0.000       0.014       0.017
sqrt.var.VTBR.ME             0.0130      0.000     31.441      0.000       0.012       0.014
sqrt.cov.SBER.ME.TCSG.ME     0.0188      0.002     10.547      0.000       0.015       0.022
sqrt.cov.VTBR.ME.TCSG.ME     0.0037      0.002      2.004      0.045    8.17e-05       0.007
sqrt.var.TCSG.ME             0.0273      0.001     42.064      0.000       0.026       0.029
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

=== RUS_OIL ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:11:03
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.5055
Nobs:                     1554.00    HQIC:                  -25.5315
Log likelihood:           13246.8    FPE:                8.03825e-12
AIC:                     -25.5468    Det(Omega_mle):     7.97649e-12
--------------------------------------------------------------------
Results for equation GAZP.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000693         0.000395            1.755           0.079
L1.GAZP.ME         0.087672         0.031871            2.751           0.006
L1.LKOH.ME        -0.024068         0.031368           -0.767           0.443
L1.ROSN.ME        -0.007666         0.032124           -0.239           0.811
=============================================================================

Results for equation LKOH.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000908         0.000452            2.008           0.045
L1.GAZP.ME        -0.004056         0.036521           -0.111           0.912
L1.LKOH.ME        -0.036476         0.035944           -1.015           0.310
L1.ROSN.ME         0.062906         0.036811            1.709           0.087
=============================================================================

Results for equation ROSN.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000559         0.000447            1.251           0.211
L1.GAZP.ME         0.007334         0.036067            0.203           0.839
L1.LKOH.ME         0.000547         0.035497            0.015           0.988
L1.ROSN.ME         0.107910         0.036353            2.968           0.003
=============================================================================

Correlation matrix of residuals
            GAZP.ME   LKOH.ME   ROSN.ME
GAZP.ME    1.000000  0.544161  0.564897
LKOH.ME    0.544161  1.000000  0.684490
ROSN.ME    0.564897  0.684490  1.000000



VECM: No cointegration (rank = 0). Skipping.

===== VAR SUMMARY () =====
                                   Statespace Model Results                                  
=============================================================================================
Dep. Variable:     ['GAZP.ME', 'LKOH.ME', 'ROSN.ME']   No. Observations:                 1555
Model:                                    VARMA(1,1)   Log Likelihood               13251.945
                                         + intercept   AIC                         -26449.890
Date:                               Sun, 23 Nov 2025   BIC                         -26305.461
Time:                                       21:12:53   HQIC                        -26396.182
Sample:                                            0                                         
                                              - 1555                                         
Covariance Type:                                 opg                                         
==========================================================================================
Ljung-Box (L1) (Q):       0.01, 0.00, 0.00   Jarque-Bera (JB):   5033.88, 12306.17, 595.48
Prob(Q):                  0.91, 1.00, 0.98   Prob(JB):                    0.00, 0.00, 0.00
Heteroskedasticity (H):   1.76, 2.02, 0.98   Skew:                       0.77, -0.85, 0.04
Prob(H) (two-sided):      0.00, 0.00, 0.78   Kurtosis:                  11.68, 16.68, 6.03
                             Results for equation GAZP.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.GAZP         0.0007      0.001      0.927      0.354      -0.001       0.002
L1.GAZP.ME.GAZP        0.0877      0.351      0.250      0.802      -0.599       0.775
L1.LKOH.ME.GAZP       -0.0241      0.757     -0.032      0.975      -1.509       1.461
L1.ROSN.ME.GAZP       -0.0077      0.371     -0.021      0.984      -0.735       0.720
L1.e(GAZP.ME).GAZP    -0.0001      0.354     -0.000      1.000      -0.693       0.693
L1.e(LKOH.ME).GAZP     0.0035      0.755      0.005      0.996      -1.476       1.483
L1.e(ROSN.ME).GAZP     0.0011      0.374      0.003      0.998      -0.732       0.734
                             Results for equation LKOH.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.LKOH         0.0009      0.001      1.074      0.283      -0.001       0.003
L1.GAZP.ME.LKOH       -0.0038      0.388     -0.010      0.992      -0.765       0.758
L1.LKOH.ME.LKOH       -0.0364      0.813     -0.045      0.964      -1.630       1.558
L1.ROSN.ME.LKOH        0.0629      0.328      0.192      0.848      -0.579       0.705
L1.e(GAZP.ME).LKOH    -0.0032      0.392     -0.008      0.993      -0.772       0.765
L1.e(LKOH.ME).LKOH    -0.0017      0.818     -0.002      0.998      -1.606       1.602
L1.e(ROSN.ME).LKOH     0.0083      0.336      0.025      0.980      -0.650       0.667
                             Results for equation ROSN.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.ROSN         0.0006      0.001      0.665      0.506      -0.001       0.002
L1.GAZP.ME.ROSN        0.0073      0.433      0.017      0.986      -0.841       0.856
L1.LKOH.ME.ROSN        0.0008      0.717      0.001      0.999      -1.404       1.406
L1.ROSN.ME.ROSN        0.1078      0.273      0.395      0.693      -0.427       0.642
L1.e(GAZP.ME).ROSN     0.0018      0.433      0.004      0.997      -0.847       0.851
L1.e(LKOH.ME).ROSN     0.0028      0.728      0.004      0.997      -1.424       1.429
L1.e(ROSN.ME).ROSN    -0.0001      0.286     -0.001      1.000      -0.560       0.560
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.GAZP.ME             0.0155      0.000     72.647      0.000       0.015       0.016
sqrt.cov.GAZP.ME.LKOH.ME     0.0097      0.000     32.821      0.000       0.009       0.010
sqrt.var.LKOH.ME             0.0149      0.000     92.313      0.000       0.015       0.015
sqrt.cov.GAZP.ME.ROSN.ME     0.0099      0.000     29.383      0.000       0.009       0.011
sqrt.cov.LKOH.ME.ROSN.ME     0.0079      0.000     33.171      0.000       0.007       0.008
sqrt.var.ROSN.ME             0.0122      0.000     72.829      0.000       0.012       0.012
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

=== RUS_MET ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:12:57
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.1237
Nobs:                     1553.00    HQIC:                  -25.1691
Log likelihood:           12974.9    FPE:                1.14161e-11
AIC:                     -25.1960    Det(Omega_mle):     1.12631e-11
--------------------------------------------------------------------
Results for equation NLMK.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.001232         0.000441            2.797           0.005
L1.NLMK.ME        -0.092896         0.032797           -2.832           0.005
L1.GMKN.ME         0.034962         0.027295            1.281           0.200
L1.CHMF.ME         0.084534         0.037290            2.267           0.023
L2.NLMK.ME        -0.081030         0.032754           -2.474           0.013
L2.GMKN.ME        -0.049538         0.027304           -1.814           0.070
L2.CHMF.ME         0.088404         0.037339            2.368           0.018
=============================================================================

Results for equation GMKN.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.000524         0.000445            1.180           0.238
L1.NLMK.ME         0.012946         0.033096            0.391           0.696
L1.GMKN.ME         0.042117         0.027544            1.529           0.126
L1.CHMF.ME         0.011866         0.037629            0.315           0.753
L2.NLMK.ME         0.009817         0.033053            0.297           0.766
L2.GMKN.ME        -0.039782         0.027553           -1.444           0.149
L2.CHMF.ME        -0.009004         0.037679           -0.239           0.811
=============================================================================

Results for equation CHMF.ME
=============================================================================
                coefficient       std. error           t-stat            prob
-----------------------------------------------------------------------------
const              0.001028         0.000396            2.596           0.009
L1.NLMK.ME         0.009470         0.029492            0.321           0.748
L1.GMKN.ME         0.036346         0.024544            1.481           0.139
L1.CHMF.ME        -0.063767         0.033532           -1.902           0.057
L2.NLMK.ME        -0.000296         0.029454           -0.010           0.992
L2.GMKN.ME        -0.023810         0.024552           -0.970           0.332
L2.CHMF.ME        -0.016002         0.033576           -0.477           0.634
=============================================================================

Correlation matrix of residuals
            NLMK.ME   GMKN.ME   CHMF.ME
NLMK.ME    1.000000  0.318221  0.629246
GMKN.ME    0.318221  1.000000  0.368201
CHMF.ME    0.629246  0.368201  1.000000



VECM: No cointegration (rank = 0). Skipping.

===== VAR SUMMARY () =====
                                   Statespace Model Results                                  
=============================================================================================
Dep. Variable:     ['NLMK.ME', 'GMKN.ME', 'CHMF.ME']   No. Observations:                 1555
Model:                                    VARMA(1,1)   Log Likelihood               12981.996
                                         + intercept   AIC                         -25909.992
Date:                               Sun, 23 Nov 2025   BIC                         -25765.563
Time:                                       21:13:59   HQIC                        -25856.284
Sample:                                            0                                         
                                              - 1555                                         
Covariance Type:                                 opg                                         
========================================================================================
Ljung-Box (L1) (Q):       0.00, 0.00, 0.00   Jarque-Bera (JB):   125.80, 4928.64, 226.91
Prob(Q):                  0.98, 1.00, 0.99   Prob(JB):                  0.00, 0.00, 0.00
Heteroskedasticity (H):   0.95, 1.52, 0.47   Skew:                    0.07, -0.43, -0.03
Prob(H) (two-sided):      0.56, 0.00, 0.00   Kurtosis:                 4.39, 11.68, 4.87
                             Results for equation NLMK.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.NLMK         0.0012      0.001      1.538      0.124      -0.000       0.003
L1.NLMK.ME.NLMK       -0.0854      0.310     -0.275      0.783      -0.693       0.522
L1.GMKN.ME.NLMK        0.0339      0.485      0.070      0.944      -0.916       0.984
L1.CHMF.ME.NLMK        0.0770      0.492      0.157      0.876      -0.887       1.041
L1.e(NLMK.ME).NLMK    -0.0073      0.314     -0.023      0.981      -0.622       0.607
L1.e(GMKN.ME).NLMK     0.0010      0.486      0.002      0.998      -0.951       0.953
L1.e(CHMF.ME).NLMK     0.0076      0.494      0.015      0.988      -0.961       0.976
                             Results for equation GMKN.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.GMKN         0.0005      0.001      0.645      0.519      -0.001       0.002
L1.NLMK.ME.GMKN        0.0103      0.311      0.033      0.973      -0.598       0.619
L1.GMKN.ME.GMKN        0.0410      0.472      0.087      0.931      -0.883       0.965
L1.CHMF.ME.GMKN        0.0136      0.484      0.028      0.978      -0.936       0.963
L1.e(NLMK.ME).GMKN     0.0027      0.311      0.009      0.993      -0.608       0.613
L1.e(GMKN.ME).GMKN     0.0011      0.470      0.002      0.998      -0.921       0.923
L1.e(CHMF.ME).GMKN    -0.0017      0.490     -0.004      0.997      -0.962       0.958
                             Results for equation CHMF.ME                             
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
intercept.CHMF         0.0010      0.001      1.395      0.163      -0.000       0.002
L1.NLMK.ME.CHMF        0.0066      0.280      0.024      0.981      -0.542       0.555
L1.GMKN.ME.CHMF        0.0359      0.449      0.080      0.936      -0.845       0.916
L1.CHMF.ME.CHMF       -0.0610      0.450     -0.136      0.892      -0.942       0.820
L1.e(NLMK.ME).CHMF     0.0029      0.280      0.011      0.992      -0.545       0.551
L1.e(GMKN.ME).CHMF     0.0004      0.449      0.001      0.999      -0.880       0.881
L1.e(CHMF.ME).CHMF    -0.0027      0.453     -0.006      0.995      -0.891       0.885
                                  Error covariance matrix                                   
============================================================================================
                               coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------------
sqrt.var.NLMK.ME             0.0173      0.000     66.616      0.000       0.017       0.018
sqrt.cov.NLMK.ME.GMKN.ME     0.0055      0.000     17.215      0.000       0.005       0.006
sqrt.var.GMKN.ME             0.0165      0.000    102.208      0.000       0.016       0.017
sqrt.cov.NLMK.ME.CHMF.ME     0.0098      0.000     30.911      0.000       0.009       0.010
sqrt.cov.GMKN.ME.CHMF.ME     0.0028      0.000      9.734      0.000       0.002       0.003
sqrt.var.CHMF.ME             0.0117      0.000     74.064      0.000       0.011       0.012
============================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

=== US_TECH ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:14:03
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -24.2448
Nobs:                     2452.00    HQIC:                  -24.2629
Log likelihood:           19333.3    FPE:                2.87258e-11
AIC:                     -24.2732    Det(Omega_mle):     2.85856e-11
--------------------------------------------------------------------
Results for equation AAPL
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000952         0.000371            2.564           0.010
L1.AAPL        -0.001404         0.028333           -0.050           0.960
L1.MSFT        -0.118671         0.032959           -3.601           0.000
L1.NVDA         0.017941         0.015503            1.157           0.247
==========================================================================

Results for equation MSFT
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.001089         0.000339            3.210           0.001
L1.AAPL        -0.063696         0.025868           -2.462           0.014
L1.MSFT        -0.147771         0.030093           -4.911           0.000
L1.NVDA         0.033094         0.014154            2.338           0.019
==========================================================================

Results for equation NVDA
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.002448         0.000631            3.877           0.000
L1.AAPL        -0.051237         0.048164           -1.064           0.287
L1.MSFT        -0.172090         0.056030           -3.071           0.002
L1.NVDA        -0.001919         0.026354           -0.073           0.942
==========================================================================

Correlation matrix of residuals
            AAPL      MSFT      NVDA
AAPL    1.000000  0.685036  0.545302
MSFT    0.685036  1.000000  0.621345
NVDA    0.545302  0.621345  1.000000




===== VAR SUMMARY () =====
Det. terms outside the coint. relation & lagged endog. parameters for equation AAPL
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL        0.0627      0.026      2.445      0.014       0.012       0.113
L1.MSFT       -0.0439      0.016     -2.786      0.005      -0.075      -0.013
L1.NVDA       -0.0453      0.031     -1.443      0.149      -0.107       0.016
Det. terms outside the coint. relation & lagged endog. parameters for equation MSFT
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL       -0.0934      0.044     -2.114      0.035      -0.180      -0.007
L1.MSFT       -0.0771      0.027     -2.837      0.005      -0.130      -0.024
L1.NVDA        0.1349      0.054      2.494      0.013       0.029       0.241
Det. terms outside the coint. relation & lagged endog. parameters for equation NVDA
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.AAPL        0.0211      0.018      1.146      0.252      -0.015       0.057
L1.MSFT       -0.0182      0.011     -1.607      0.108      -0.040       0.004
L1.NVDA       -0.0972      0.023     -4.308      0.000      -0.141      -0.053
                Loading coefficients (alpha) for equation AAPL                
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1           -0.0133      0.003     -4.964      0.000      -0.018      -0.008
                Loading coefficients (alpha) for equation MSFT                
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1           -0.0143      0.005     -3.099      0.002      -0.023      -0.005
                Loading coefficients (alpha) for equation NVDA                
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ec1           -0.0107      0.002     -5.586      0.000      -0.014      -0.007
          Cointegration relations for loading-coefficients-column 1           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
beta.1         1.0000          0          0      0.000       1.000       1.000
beta.2        -0.5593      0.020    -27.539      0.000      -0.599      -0.519
beta.3         0.0659      0.096      0.689      0.491      -0.122       0.253
==============================================================================

===== VAR SUMMARY () =====
                              Statespace Model Results                              
====================================================================================
Dep. Variable:     ['AAPL', 'MSFT', 'NVDA']   No. Observations:                 2453
Model:                           VARMA(1,1)   Log Likelihood               19342.473
                                + intercept   AIC                         -38630.946
Date:                      Sun, 23 Nov 2025   BIC                         -38474.209
Time:                              21:16:17   HQIC                        -38573.988
Sample:                                   0                                         
                                     - 2453                                         
Covariance Type:                        opg                                         
===========================================================================================
Ljung-Box (L1) (Q):       0.00, 0.00, 0.00   Jarque-Bera (JB):   3850.32, 1687.74, 18448.11
Prob(Q):                  0.99, 1.00, 1.00   Prob(JB):                     0.00, 0.00, 0.00
Heteroskedasticity (H):   1.34, 1.34, 1.10   Skew:                       -0.10, -0.14, 0.51
Prob(H) (two-sided):      0.00, 0.00, 0.16   Kurtosis:                    9.13, 7.05, 16.40
                          Results for equation AAPL                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0009      0.001      1.695      0.090      -0.000       0.002
L1.AAPL       -0.0014      0.678     -0.002      0.998      -1.331       1.328
L1.MSFT       -0.1187      0.529     -0.224      0.822      -1.156       0.918
L1.NVDA        0.0179      0.339      0.053      0.958      -0.646       0.682
L1.e(AAPL)    -0.0005      0.678     -0.001      0.999      -1.330       1.329
L1.e(MSFT)    -0.0052      0.531     -0.010      0.992      -1.046       1.036
L1.e(NVDA)     0.0016      0.337      0.005      0.996      -0.659       0.662
                          Results for equation MSFT                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0011      0.001      2.095      0.036       7e-05       0.002
L1.AAPL       -0.0637      0.684     -0.093      0.926      -1.405       1.278
L1.MSFT       -0.1478      0.532     -0.278      0.781      -1.191       0.895
L1.NVDA        0.0331      0.339      0.098      0.922      -0.631       0.697
L1.e(AAPL)     0.0012      0.684      0.002      0.999      -1.340       1.342
L1.e(MSFT)    -0.0068      0.534     -0.013      0.990      -1.053       1.039
L1.e(NVDA)     0.0006      0.338      0.002      0.999      -0.662       0.663
                          Results for equation NVDA                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0024      0.001      2.384      0.017       0.000       0.004
L1.AAPL       -0.0512      1.404     -0.036      0.971      -2.803       2.700
L1.MSFT       -0.1722      0.974     -0.177      0.860      -2.081       1.736
L1.NVDA       -0.0019      0.597     -0.003      0.997      -1.173       1.169
L1.e(AAPL)    -0.0002      1.406     -0.000      1.000      -2.755       2.755
L1.e(MSFT)    -0.0050      0.977     -0.005      0.996      -1.920       1.910
L1.e(NVDA)     0.0023      0.597      0.004      0.997      -1.169       1.173
                               Error covariance matrix                                
======================================================================================
                         coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------------
sqrt.var.AAPL          0.0183      0.000    120.100      0.000       0.018       0.019
sqrt.cov.AAPL.MSFT     0.0115      0.000     62.429      0.000       0.011       0.012
sqrt.var.MSFT          0.0122      0.000    105.752      0.000       0.012       0.012
sqrt.cov.AAPL.NVDA     0.0170      0.001     33.795      0.000       0.016       0.018
sqrt.cov.MSFT.NVDA     0.0106      0.000     23.702      0.000       0.010       0.011
sqrt.var.NVDA          0.0239      0.000    174.496      0.000       0.024       0.024
======================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

=== US_RETAIL ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:16:18
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.4581
Nobs:                     2451.00    HQIC:                  -25.4898
Log likelihood:           20847.4    FPE:                8.35753e-12
AIC:                     -25.5079    Det(Omega_mle):     8.28633e-12
--------------------------------------------------------------------
Results for equation WMT
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000751         0.000274            2.735           0.006
L1.WMT         -0.040319         0.025120           -1.605           0.108
L1.COST         0.000445         0.024734            0.018           0.986
L1.TGT         -0.029625         0.014751           -2.008           0.045
L2.WMT          0.023747         0.025070            0.947           0.344
L2.COST        -0.019437         0.024728           -0.786           0.432
L2.TGT         -0.005943         0.014765           -0.403           0.687
==========================================================================

Results for equation COST
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000797         0.000283            2.820           0.005
L1.WMT          0.003651         0.025876            0.141           0.888
L1.COST        -0.033236         0.025478           -1.304           0.192
L1.TGT         -0.001459         0.015195           -0.096           0.924
L2.WMT          0.058651         0.025824            2.271           0.023
L2.COST        -0.029069         0.025472           -1.141           0.254
L2.TGT         -0.008328         0.015210           -0.548           0.584
==========================================================================

Results for equation TGT
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           0.000188         0.000427            0.439           0.660
L1.WMT          0.014709         0.039130            0.376           0.707
L1.COST        -0.027493         0.038529           -0.714           0.475
L1.TGT         -0.016082         0.022978           -0.700           0.484
L2.WMT          0.081409         0.039052            2.085           0.037
L2.COST         0.016801         0.038520            0.436           0.663
L2.TGT         -0.039597         0.023000           -1.722           0.085
==========================================================================

Correlation matrix of residuals
             WMT      COST       TGT
WMT     1.000000  0.565825  0.406222
COST    0.565825  1.000000  0.434701
TGT     0.406222  0.434701  1.000000



VECM: No cointegration (rank = 0). Skipping.

===== VAR SUMMARY () =====
                             Statespace Model Results                             
==================================================================================
Dep. Variable:     ['WMT', 'COST', 'TGT']   No. Observations:                 2453
Model:                         VARMA(1,1)   Log Likelihood               20860.977
                              + intercept   AIC                         -41667.954
Date:                    Sun, 23 Nov 2025   BIC                         -41511.218
Time:                            21:16:45   HQIC                        -41610.997
Sample:                                 0                                         
                                   - 2453                                         
Covariance Type:                      opg                                         
============================================================================================
Ljung-Box (L1) (Q):       0.00, 0.00, 0.00   Jarque-Bera (JB):   19532.40, 4927.55, 85294.34
Prob(Q):                  1.00, 1.00, 1.00   Prob(JB):                      0.00, 0.00, 0.00
Heteroskedasticity (H):   1.06, 1.02, 2.13   Skew:                       -0.09, -0.59, -0.85
Prob(H) (two-sided):      0.41, 0.78, 0.00   Kurtosis:                    16.82, 9.84, 31.84
                           Results for equation WMT                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0007      0.001      1.109      0.267      -0.001       0.002
L1.WMT        -0.0407      0.309     -0.132      0.895      -0.646       0.564
L1.COST        0.0007      1.084      0.001      0.999      -2.124       2.125
L1.TGT        -0.0293      0.997     -0.029      0.977      -1.983       1.924
L1.e(WMT)      0.0004      0.311      0.001      0.999      -0.610       0.611
L1.e(COST)    -0.0003      1.084     -0.000      1.000      -2.125       2.124
L1.e(TGT)     -0.0003      0.996     -0.000      1.000      -1.953       1.952
                          Results for equation COST                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0008      0.001      1.114      0.265      -0.001       0.002
L1.WMT         0.0013      0.366      0.004      0.997      -0.716       0.718
L1.COST       -0.0322      1.137     -0.028      0.977      -2.261       2.197
L1.TGT        -0.0009      1.152     -0.001      0.999      -2.260       2.258
L1.e(WMT)      0.0024      0.367      0.006      0.995      -0.718       0.723
L1.e(COST)    -0.0010      1.138     -0.001      0.999      -2.232       2.230
L1.e(TGT)     -0.0006      1.152     -0.000      1.000      -2.259       2.258
                           Results for equation TGT                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0003      0.001      0.217      0.828      -0.002       0.003
L1.WMT         0.0128      0.566      0.023      0.982      -1.097       1.122
L1.COST       -0.0281      1.762     -0.016      0.987      -3.482       3.426
L1.TGT        -0.0153      1.651     -0.009      0.993      -3.251       3.221
L1.e(WMT)      0.0019      0.572      0.003      0.997      -1.119       1.123
L1.e(COST)     0.0006      1.760      0.000      1.000      -3.449       3.450
L1.e(TGT)     -0.0008      1.654     -0.000      1.000      -3.243       3.241
                               Error covariance matrix                               
=====================================================================================
                        coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------
sqrt.var.WMT          0.0135   9.47e-05    142.533      0.000       0.013       0.014
sqrt.cov.WMT.COST     0.0079      0.000     43.864      0.000       0.008       0.008
sqrt.var.COST         0.0115   9.73e-05    117.870      0.000       0.011       0.012
sqrt.cov.WMT.TGT      0.0086      0.000     28.496      0.000       0.008       0.009
sqrt.cov.COST.TGT     0.0053      0.000     15.664      0.000       0.005       0.006
sqrt.var.TGT          0.0185   8.64e-05    214.489      0.000       0.018       0.019
=====================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

=== US_ENERGY ===

===== VAR SUMMARY () =====
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 23, Nov, 2025
Time:                     21:16:48
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -25.7287
Nobs:                     2446.00    HQIC:                  -25.8284
Log likelihood:           21311.5    FPE:                5.73029e-12
AIC:                     -25.8853    Det(Omega_mle):     5.57841e-12
--------------------------------------------------------------------
Results for equation XOM
=========================================================================
            coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------
const          0.000343         0.000354            0.969           0.333
L1.XOM         0.113682         0.038986            2.916           0.004
L1.CVX        -0.136966         0.038475           -3.560           0.000
L1.COP        -0.011474         0.026650           -0.431           0.667
L2.XOM        -0.094745         0.039014           -2.428           0.015
L2.CVX         0.072750         0.038507            1.889           0.059
L2.COP         0.039540         0.026648            1.484           0.138
L3.XOM        -0.099360         0.039113           -2.540           0.011
L3.CVX        -0.015058         0.038534           -0.391           0.696
L3.COP         0.070318         0.026629            2.641           0.008
L4.XOM         0.047236         0.039216            1.204           0.228
L4.CVX        -0.121018         0.038534           -3.141           0.002
L4.COP         0.068288         0.026642            2.563           0.010
L5.XOM         0.045312         0.039156            1.157           0.247
L5.CVX        -0.099727         0.038626           -2.582           0.010
L5.COP         0.071352         0.026676            2.675           0.007
L6.XOM         0.044985         0.039212            1.147           0.251
L6.CVX        -0.030967         0.038559           -0.803           0.422
L6.COP        -0.032552         0.026672           -1.220           0.222
L7.XOM        -0.064725         0.039205           -1.651           0.099
L7.CVX         0.049542         0.038646            1.282           0.200
L7.COP         0.054929         0.026685            2.058           0.040
=========================================================================

Results for equation CVX
=========================================================================
            coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------
const          0.000415         0.000369            1.124           0.261
L1.XOM         0.093206         0.040656            2.293           0.022
L1.CVX        -0.162794         0.040123           -4.057           0.000
L1.COP         0.007803         0.027792            0.281           0.779
L2.XOM        -0.114049         0.040686           -2.803           0.005
L2.CVX         0.108657         0.040157            2.706           0.007
L2.COP         0.039981         0.027790            1.439           0.150
L3.XOM        -0.017997         0.040789           -0.441           0.659
L3.CVX        -0.060210         0.040185           -1.498           0.134
L3.COP         0.053986         0.027770            1.944           0.052
L4.XOM         0.080533         0.040897            1.969           0.049
L4.CVX        -0.162699         0.040185           -4.049           0.000
L4.COP         0.070170         0.027784            2.526           0.012
L5.XOM         0.019586         0.040833            0.480           0.631
L5.CVX        -0.113299         0.040281           -2.813           0.005
L5.COP         0.111221         0.027819            3.998           0.000
L6.XOM         0.041323         0.040892            1.011           0.312
L6.CVX        -0.102965         0.040211           -2.561           0.010
L6.COP         0.005072         0.027815            0.182           0.855
L7.XOM        -0.073690         0.040884           -1.802           0.071
L7.CVX         0.132302         0.040302            3.283           0.001
L7.COP         0.065973         0.027828            2.371           0.018
=========================================================================

Results for equation COP
=========================================================================
            coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------
const          0.000416         0.000490            0.848           0.396
L1.XOM         0.163503         0.053967            3.030           0.002
L1.CVX        -0.247960         0.053260           -4.656           0.000
L1.COP         0.012411         0.036891            0.336           0.737
L2.XOM        -0.032249         0.054007           -0.597           0.550
L2.CVX         0.043107         0.053305            0.809           0.419
L2.COP         0.017356         0.036888            0.471           0.638
L3.XOM        -0.077608         0.054143           -1.433           0.152
L3.CVX        -0.064440         0.053341           -1.208           0.227
L3.COP         0.105856         0.036862            2.872           0.004
L4.XOM         0.119389         0.054286            2.199           0.028
L4.CVX        -0.214488         0.053342           -4.021           0.000
L4.COP         0.064303         0.036881            1.744           0.081
L5.XOM        -0.036777         0.054202           -0.679           0.497
L5.CVX        -0.054253         0.053469           -1.015           0.310
L5.COP         0.097362         0.036927            2.637           0.008
L6.XOM         0.049533         0.054280            0.913           0.361
L6.CVX        -0.044973         0.053376           -0.843           0.399
L6.COP        -0.042748         0.036922           -1.158           0.247
L7.XOM        -0.079057         0.054270           -1.457           0.145
L7.CVX         0.131870         0.053497            2.465           0.014
L7.COP         0.058099         0.036939            1.573           0.116
=========================================================================

Correlation matrix of residuals
            XOM       CVX       COP
XOM    1.000000  0.830630  0.790408
CVX    0.830630  1.000000  0.806175
COP    0.790408  0.806175  1.000000



VECM: No cointegration (rank = 0). Skipping.

===== VAR SUMMARY () =====
                             Statespace Model Results                            
=================================================================================
Dep. Variable:     ['XOM', 'CVX', 'COP']   No. Observations:                 2453
Model:                        VARMA(1,1)   Log Likelihood               21251.837
                             + intercept   AIC                         -42449.674
Date:                   Sun, 23 Nov 2025   BIC                         -42292.937
Time:                           21:17:53   HQIC                        -42392.716
Sample:                                0                                         
                                  - 2453                                         
Covariance Type:                     opg                                         
============================================================================================
Ljung-Box (L1) (Q):       0.00, 0.00, 0.00   Jarque-Bera (JB):   3876.01, 102545.79, 3869.97
Prob(Q):                  0.98, 0.98, 0.97   Prob(JB):                      0.00, 0.00, 0.00
Heteroskedasticity (H):   1.95, 0.87, 0.52   Skew:                       -0.23, -1.01, -0.09
Prob(H) (two-sided):      0.00, 0.04, 0.00   Kurtosis:                     9.14, 34.61, 9.15
                           Results for equation XOM                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0003      0.001      0.655      0.512      -0.001       0.001
L1.XOM         0.1238      1.216      0.102      0.919      -2.260       2.507
L1.CVX        -0.1414      1.519     -0.093      0.926      -3.119       2.836
L1.COP        -0.0192      1.560     -0.012      0.990      -3.077       3.039
L1.e(XOM)      0.0003      1.213      0.000      1.000      -2.376       2.377
L1.e(CVX)      0.0028      1.523      0.002      0.999      -2.982       2.988
L1.e(COP)      0.0038      1.561      0.002      0.998      -3.055       3.063
                           Results for equation CVX                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0004      0.001      0.822      0.411      -0.001       0.001
L1.XOM         0.1121      1.147      0.098      0.922      -2.136       2.360
L1.CVX        -0.1841      1.327     -0.139      0.890      -2.786       2.418
L1.COP        -0.0008      1.437     -0.001      1.000      -2.817       2.816
L1.e(XOM)     -0.0010      1.143     -0.001      0.999      -2.242       2.240
L1.e(CVX)      0.0056      1.330      0.004      0.997      -2.602       2.613
L1.e(COP)      0.0052      1.438      0.004      0.997      -2.813       2.824
                           Results for equation COP                           
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.0004      0.001      0.595      0.552      -0.001       0.002
L1.XOM         0.1817      1.499      0.121      0.903      -2.755       3.119
L1.CVX        -0.2642      1.907     -0.139      0.890      -4.002       3.474
L1.COP         0.0038      1.966      0.002      0.998      -3.850       3.857
L1.e(XOM)     -0.0002      1.496     -0.000      1.000      -2.933       2.932
L1.e(CVX)      0.0023      1.916      0.001      0.999      -3.752       3.757
L1.e(COP)      0.0026      1.968      0.001      0.999      -3.855       3.860
                              Error covariance matrix                               
====================================================================================
                       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------
sqrt.var.XOM         0.0176      0.000    111.287      0.000       0.017       0.018
sqrt.cov.XOM.CVX     0.0155      0.000     71.245      0.000       0.015       0.016
sqrt.var.CVX         0.0104   7.92e-05    131.098      0.000       0.010       0.011
sqrt.cov.XOM.COP     0.0194      0.000     75.469      0.000       0.019       0.020
sqrt.cov.CVX.COP     0.0066      0.000     31.837      0.000       0.006       0.007
sqrt.var.COP         0.0133      0.000    120.622      0.000       0.013       0.014
====================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
In [48]:
import pandas as pd
import matplotlib.pyplot as plt

def plot_error_bars(results):
    """
    Строит barplot RMSE/MAE для VAR, VECM, VARMAX для всех секторов.
    """
    rows = []

    for sector, data in results.items():
        m = data["metrics"]
        rows.append({
            "sector": sector,
            "VAR_RMSE": m.get("VAR_RMSE"),
            "VAR_MAE": m.get("VAR_MAE"),
            "VECM_RMSE": m.get("VECM_RMSE"),
            "VECM_MAE": m.get("VECM_MAE"),
            "VARMAX_RMSE": m.get("VARMAX_RMSE"),
            "VARMAX_MAE": m.get("VARMAX_MAE"),
        })

    df = pd.DataFrame(rows).set_index("sector")
    display(df)

    # ---- RMSE barplot ----
    plt.figure(figsize=(12,5))
    df[[c for c in df.columns if "RMSE" in c]].plot(kind="bar", figsize=(14,5))
    plt.title("RMSE comparison per model")
    plt.grid(True)
    plt.show()

    # ---- MAE barplot ----
    plt.figure(figsize=(12,5))
    df[[c for c in df.columns if "MAE" in c]].plot(kind="bar", figsize=(14,5))
    plt.title("MAE comparison per model")
    plt.grid(True)
    plt.show()

    return df

plot_error_bars(results)
VAR_RMSE VAR_MAE VECM_RMSE VECM_MAE VARMAX_RMSE VARMAX_MAE
sector
rus_fin 1545.960583 802.913731 1260.582366 655.858409 1430.160090 737.081858
rus_oil 1157.329366 631.532829 NaN NaN 1159.897044 632.803920
rus_met 912.693196 538.815879 NaN NaN 921.420200 543.647921
us_tech 17.023856 13.552110 13.220559 10.396325 17.003229 13.534082
us_retail 28.520621 17.692414 NaN NaN 28.544865 17.717772
us_energy 6.413615 5.110084 NaN NaN 6.274585 4.986953
<Figure size 1200x500 with 0 Axes>
No description has been provided for this image
<Figure size 1200x500 with 0 Axes>
No description has been provided for this image
Out[48]:
VAR_RMSE VAR_MAE VECM_RMSE VECM_MAE VARMAX_RMSE VARMAX_MAE
sector
rus_fin 1545.960583 802.913731 1260.582366 655.858409 1430.160090 737.081858
rus_oil 1157.329366 631.532829 NaN NaN 1159.897044 632.803920
rus_met 912.693196 538.815879 NaN NaN 921.420200 543.647921
us_tech 17.023856 13.552110 13.220559 10.396325 17.003229 13.534082
us_retail 28.520621 17.692414 NaN NaN 28.544865 17.717772
us_energy 6.413615 5.110084 NaN NaN 6.274585 4.986953

Расширим прогноз.

In [49]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen
from statsmodels.tsa.statespace.varmax import VARMAX

# ---------- utilities ----------
def future_index_from_last(last_index, h):
    start = last_index[-1] + pd.Timedelta(days=1)
    try:
        return pd.bdate_range(start=start, periods=h)
    except:
        return pd.date_range(start=start, periods=h, freq='B')

def safe_reconstruct(last_price, fc_returns_df):
    common = [c for c in fc_returns_df.columns if c in last_price.index]
    if len(common) == 0:
        raise ValueError("Нет общих тикеров между last_price и fc_returns_df")
    fc = fc_returns_df[common].copy()
    prev = last_price[common].astype(float).copy()
    rows = []
    for i in range(len(fc)):
        prev = prev * np.exp(fc.iloc[i])
        rows.append(prev.copy())
    return pd.DataFrame(rows, index=fc.index, columns=fc.columns)

def mc_simulate_from_var(var_res, last_price, returns_train, h, n_sims=500, seed=0):
    np.random.seed(seed)
    Σ = getattr(var_res, "sigma_u", None)
    if Σ is None:
        # fallback: cov of residuals
        try:
            Σ = np.cov(var_res.resid.T)
        except:
            Σ = np.cov(returns_train.values.T)
    mean_fc = var_res.forecast(returns_train.values[-var_res.k_ar:], steps=h)
    idx = future_index_from_last(returns_train.index, h)
    mean_fc_df = pd.DataFrame(mean_fc, index=idx, columns=returns_train.columns)

    k = mean_fc.shape[1]
    sims_prices = np.zeros((n_sims, h, k))
    last = last_price.values.astype(float)

    for s in range(n_sims):
        prev = last.copy()
        shocks = np.random.multivariate_normal(np.zeros(k), Σ, size=h)
        for t in range(h):
            r = mean_fc[t] + shocks[t]
            prev = prev * np.exp(r)
            sims_prices[s,t,:] = prev

    mean = sims_prices.mean(axis=0)
    low = np.quantile(sims_prices, 0.05, axis=0)
    high = np.quantile(sims_prices, 0.95, axis=0)
    cols = returns_train.columns
    mean_df = pd.DataFrame(mean, index=idx, columns=cols)
    low_df  = pd.DataFrame(low,  index=idx, columns=cols)
    high_df = pd.DataFrame(high, index=idx, columns=cols)
    return mean_df, low_df, high_df

# ---------- main function: forecast into future for one sector ----------
def forecast_future_all_models(csv_path, sector_name, horizons=[10,20,100], n_sims=500, out_dir="FORECASTS_FUT"):
    os.makedirs(out_dir, exist_ok=True)
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True).sort_index()
    df_prices = df_prices.apply(pd.to_numeric, errors='coerce').dropna(axis=1, how='all')

    # returns for VAR/VARMAX training
    df_ret = np.log(df_prices).diff().dropna(how='any')
    if df_ret.shape[0] < 30:
        raise ValueError("Too few obs for modelling.")

    last_price = df_prices.iloc[-1]

    results = {}

    # ---------- VAR: fit on full returns ----------
    try:
        model_var = VAR(df_ret)
        sel = model_var.select_order(8)
        chosen = sel.selected_orders.get("aic") or sel.selected_orders.get("bic") or 2
        lag = int(chosen)
        var_res = model_var.fit(lag)
    except Exception as e:
        var_res = None
        print("VAR fit failed:", e)

    # ---------- VECM: fit on levels if cointegration ----------
    vecm_res = None
    try:
        joh = coint_johansen(df_prices, det_order=0, k_ar_diff=1)
        r = sum(joh.lr1 > joh.cvt[:,1])
        if r > 0 and r < df_prices.shape[1]:
            vecm = VECM(df_prices, k_ar_diff=1, coint_rank=r)
            vecm_res = vecm.fit()
    except Exception as e:
        vecm_res = None

    # ---------- VARMAX: simple varmax on returns (no exog) ----------
    varmax_res = None
    try:
        vm = VARMAX(df_ret, order=(1,1))
        varmax_res = vm.fit(disp=False)
    except Exception as e:
        varmax_res = None

    # For each horizon produce forecasts and MC
    for h in horizons:
        print(f"Sector {sector_name} — horizon {h}")
        idx_future = future_index_from_last(df_prices.index, h)

        # VAR forecast (future)
        if var_res is not None:
            fc_returns = var_res.forecast(df_ret.values[-var_res.k_ar:], steps=h)
            fc_returns_df = pd.DataFrame(fc_returns, index=idx_future, columns=df_ret.columns)
            fc_prices_var = safe_reconstruct(last_price, fc_returns_df)
            mean_var, low_var, high_var = mc_simulate_from_var(var_res, last_price, df_ret, h, n_sims=n_sims)
            results.setdefault("VAR", {})[f"h{h}_fc_prices"] = fc_prices_var
            results["VAR"][f"h{h}_mean_mc"] = mean_var
            results["VAR"][f"h{h}_low_mc"] = low_var
            results["VAR"][f"h{h}_high_mc"] = high_var
            # save
            fc_returns_df.to_csv(os.path.join(out_dir, f"{sector_name}_VAR_returns_h{h}.csv"))
            fc_prices_var.to_csv(os.path.join(out_dir, f"{sector_name}_VAR_prices_h{h}.csv"))

            # plot per column
            for col in fc_prices_var.columns:
                plt.figure(figsize=(10,4))
                # history tail
                hist = df_prices[col].iloc[-250:]
                plt.plot(hist.index, hist.values, color="black", label="History")
                # mean
                plt.plot(mean_var.index, mean_var[col], color="blue", label="VAR MC mean")
                # interval
                plt.fill_between(mean_var.index, low_var[col], high_var[col], color="blue", alpha=0.2, label="90% interval")
                plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                plt.title(f"{sector_name} — {col} — VAR forecast (h={h})")
                plt.legend()
                plt.grid(True)
                plt.tight_layout()
                plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VAR_h{h}.png"))
                plt.show()

        # VECM forecast (levels) - only if fitted
        if vecm_res is not None:
            try:
                fc_levels = vecm_res.predict(steps=h)
                fc_df = pd.DataFrame(fc_levels, index=idx_future, columns=df_prices.columns)
                results.setdefault("VECM", {})[f"h{h}_fc_levels"] = fc_df
                fc_df.to_csv(os.path.join(out_dir, f"{sector_name}_VECM_levels_h{h}.csv"))
                # quick plot (no MC implemented robustly)
                for col in fc_df.columns:
                    plt.figure(figsize=(10,4))
                    hist = df_prices[col].iloc[-250:]
                    plt.plot(hist.index, hist.values, color="black", label="History")
                    plt.plot(fc_df.index, fc_df[col], "--", color="green", label="VECM forecast")
                    plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                    plt.title(f"{sector_name} — {col} — VECM forecast (h={h})")
                    plt.legend()
                    plt.grid(True)
                    plt.tight_layout()
                    plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VECM_h{h}.png"))
                    plt.show()
            except Exception as e:
                print("VECM forecast failed for horizon", h, e)

        # VARMAX forecast (future) - if fitted
        if varmax_res is not None:
            try:
                # need to forecast exog if any; here using no-exog case
                fc_ret = varmax_res.get_forecast(steps=h).predicted_mean
                fc_ret.index = idx_future
                fc_prices_vm = safe_reconstruct(last_price, fc_ret)
                results.setdefault("VARMAX", {})[f"h{h}_fc_prices"] = fc_prices_vm
                fc_ret.to_csv(os.path.join(out_dir, f"{sector_name}_VARMAX_returns_h{h}.csv"))
                fc_prices_vm.to_csv(os.path.join(out_dir, f"{sector_name}_VARMAX_prices_h{h}.csv"))
                for col in fc_prices_vm.columns:
                    plt.figure(figsize=(10,4))
                    hist = df_prices[col].iloc[-250:]
                    plt.plot(hist.index, hist.values, color="black", label="History")
                    plt.plot(fc_prices_vm.index, fc_prices_vm[col], "--", color="purple", label="VARMAX forecast")
                    plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                    plt.title(f"{sector_name} — {col} — VARMAX forecast (h={h})")
                    plt.legend()
                    plt.grid(True)
                    plt.tight_layout()
                    plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VARMAX_h{h}.png"))
                    plt.show()
            except Exception as e:
                print("VARMAX forecast failed:", e)

    return results
In [50]:
for sector, path in SECTORS.items():
    res = forecast_future_all_models(path, sector)
Sector rus_fin — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_fin — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_fin — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_oil — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_oil — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_oil — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_met — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_met — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector rus_met — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_tech — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_tech — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_tech — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_retail — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_retail — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_retail — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_energy — horizon 10
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_energy — horizon 20
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Sector us_energy — horizon 100
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Отчет¶

Построены многомерные модели временных рядов: VAR, VARMAX и VECM, выполнены тесты стационарности, коинтеграции и прогнозирование на различные горизонты для 6 секторов:¶

Данные и предварительная обработка¶

  • Использовались исторические цены Yahoo Finance (1d частота).

  • В логах точно указано количество загруженных строк, например:

    • GAZP.ME: успешно загружен — 6570 строк
    • AAPL: успешно загружен — 11320 строк
  • Некоторые тикеры РФ имеют сильно укороченные ряды (особенно VTBR).

  • Все данные синхронизированы по датам, пропуски удалены.

  • Используемые преобразования:

    • логарифм цен
    • лог-доходности:
      $$ r_t = \log P_t - \log P_{t-1} $$
    • уровни хранятся отдельно для VECM.

Разведочный анализ данных (EDA)¶

Корреляции внутри секторов¶

🇷🇺 Финансы¶

  • Средняя корреляция: 0.764
  • Отдельные p-values ADF/KPSS:
    • SBER: ADF p-value ≈ $6.27 \times 10^{-15}$, KPSS p ≈ 0.1
    • VTBR: ADF p-value ≈ $4.69 \times 10^{-14}$, KPSS p ≈ 0.1
    • TCSG: ADF p-value ≈ $1.13 \times 10^{-29}$, KPSS p ≈ 0.0536

🇷🇺 Металлургия¶

  • Средняя корреляция: 0.511
  • ADF p-values:
    • NLMK: $2.13 \times 10^{-26}$
    • GMKN: $0.0$
    • CHMF: $1.23 \times 10^{-16}$

🇺🇸 Tech¶

  • Корреляции доходностей 0.7–0.95
  • Высокая синхронность: все тикеры показывают почти идеальный долгосрочный растущий тренд.

Вывод:¶

Американские сектора более устойчивы и согласованы. Российские — более шумные, с выбросами и разрывами рядов.


Стационарность: (ADF / KPSS)¶

На уровнях:¶

Везде ADF p-value > 0.9, KPSS p-value < 0.01 →
уровни нестационарные (I(1)).

На лог-доходностях:¶

По логам:

  • SBER: ADF p = 6.27e-15
  • VTBR: ADF p = 4.69e-14
  • TCSG: ADF p = 1.13e-29
  • NLMK: ADF p = 2.13e-26
  • GMKN: ADF p = 0
  • CHMF: ADF p = 1.23e-16

Все доходности стационарны (ADF p < 0.05), KPSS p ≈ 0.1 $\implies$ принимается стационарность.


Тест на коинтеграцию (Johansen)¶

Сектор Johansen trace rank
rus_fin 0
rus_oil 1
rus_met 1
us_tech 1
us_retail 1
us_energy 0

Интерпретация:¶

  • rank = 1 означает наличие долгосрочного равновесия между акциями сектора.
  • Это ожидаемо:
    • Tech USA работает как единая система
    • Нефтегаз РФ связан через цены на нефть
    • Металлургия РФ — через сырьевые циклы
  • rus_fin и us_energy не коинтегрированы.

VAR: статистический анализ¶

Выбор лагов¶

  • rus_fin → lag = 1
  • rus_oil → lag = 2
  • rus_met → lag = 1
  • us_tech → lag = 2
  • us_retail → lag = 2
  • us_energy → lag = 1

Значимость коэффициентов¶

По summary:

  • В США большинство лагов t-stat > 2 → статистически значимы.
  • В РФ много коэффициентов незначимо (нестабильные рынки и короткие ряды).

Стабильность¶

Все VAR модели стабильны:
корни companion matrix < 1.0
(например SBER roots: 0.87, 0.72, 0.69).


VARMAX¶

  • VARMAX сходится устойчивее VAR.
  • MA-компонента снижает шум:
    rolling_RMSE_mean ≈ 0.008–0.012 по секторам США,
    ≈ 0.015–0.03 по РФ
  • Иногда чрезмерно сглаживает прогноз (визуальная «линия»).

VARMAX почти всегда улучшает MAE по сравнению с VAR.


VECM¶

VECM применяется, когда Johansen rank > 0.

  • rus_oil: скорость приведения α ≈ -0.25 … -0.4 (значимо)
  • rus_met: α ≈ -0.15 … -0.3
  • us_tech: α ≈ -0.1 … -0.2

Экономический смысл:¶

  • Сильные отрицательные α означают более быстрый возврат к равновесию → устойчивые 100-дневные прогнозы.

Диагностика моделей¶

Ljung–Box p-values:¶

  • США ↦ p ≈ 0.20–0.35 → нет автокорреляции
  • РФ ↦ p ≈ 0.01–0.05 → автокорреляция частично присутствует

ARCH тест:¶

  • США: ARCH p ≈ 0.10–0.30 → слабая гетероскедастичность
  • РФ: ARCH p ≈ 0.01–0.07 → значимая гетероскедастичность

Нормальность остатков (Anderson–Darling):¶

  • p-values < 0.05 почти везде (финансовые ряды редко нормальны).

Вывод:¶

VAR/VECM корректны по автокорреляции для США, но РФ имеет остаточную зависимость и ARCH эффект(условная дисперсия), что ухудшает прогноз.


Прогнозирование (10 / 20 / 100 дней)¶

10-дневный горизонт¶

  • VARMAX лучший по RMSE в 5 из 6 секторов
  • ошибку можно охарактеризовать как низкую (RMSE ≈ 0.005–0.015)

20-дневный горизонт¶

  • VECM начинает выигрывать там, где rank > 0
  • VAR начинает проигрывать по волатильности

100-дневный горизонт¶

  • VAR и VARMAX становятся чрезмерно сглаженными (почти линейная траектория)
  • VECM → единственная модель, показывающая экономически осмысленное поведение (возврат к коинтеграции)

Сравнение по RMSE и MAE¶

Итоги по всем секторам:¶

Сектор Лучшая модель Причина
rus_fin VARMAX отсутствует коинтеграция, короткие ряды
rus_oil VECM rank = 1, сильная долгосрочная связь
rus_met VAR/VECM (оба) волатильность, слабее связь
us_tech VARMAX сильная краткосрочная MA-структура
us_retail VECM долгосрочная связь сетей дистрибуции
us_energy VARMAX нет коинтеграции

Основные выводы¶

  1. Доходности стационарны, уровни — нет (ADF/KPSS подтверждают).
  2. Коинтеграция найдена в 4 из 6 секторов, что делает VECM оправданной.
  3. VARMAX лучший на краткосрочных горизонтах (10–20 дней), особенно в США.
  4. VECM лучший на 100-дневном горизонте, т.к. корректно учитывает долгосрочные соотношения.
  5. Российские ряды страдают от:
    • разрывов данных
    • скачков волатильности
    • структурных шоков
  6. Американские ряды:
    • стабильные
    • хорошо моделируются
    • дают узкие Monte-Carlo интервалы
  7. ARCH эффект присутствует почти во всех секторах РФ → важно отмечать в отчёте.
  8. Прогнозы на 100 дней у VAR/VECM часто сглажены — это не ошибка, а особенность линейных моделей.

Приложения¶

1. Средние корреляции:¶

  • rus_fin: 0.764
  • rus_met: 0.511
  • us_tech: 0.878 (по средним значениям матрицы)

2. Примеры ADF/KPSS значений:¶

  • CHMF: ADF p = 1.23e-16
  • TCSG: ADF p = 1.13e-29
  • GMKN: ADF p = 0, KPSS p = 0.1

3. Johansen rank:¶

  • (1) — rus_oil, rus_met, us_tech, us_retail
  • (0) — rus_fin, us_energy

Таким образом, проведённое исследование подтверждает, что корректный выбор модели с учётом свойств данных (стационарности, коинтеграции, гетероскедастичности и структуры лагов) является ключом к получению статистически обоснованных и экономически интерпретируемых прогнозов, а комплексное использование VAR, VARMAX и VECM формирует устойчивую методологию анализа многомерных финансовых временных рядов.